This addendum describes additions and corrections to the PowerPC 601 RISC
Microprocessor User's Manual. For convenience, this information is organized according
to the chapters in the user's manual; however, there is no attempt to provide a
comprehensive list of text that is affected by this information. These changes will be
incorporated in the first revision of the user's manual.

Section 1: Chapter 1, "Overview"

This section describes additional information and corrections to Chapter 1, "Overview." 

Section 
Number 

1.1.5 In the first sentence in this section replace Terabyte with Petabyte.

1.3.8.4 Replace Figure 1-6 with the following:
Figure 1-6. MPC601 Signal Groups

1.3.6.2 Replace the last sentence of the fourth paragraph with the following:

The processor ensures that the ITLB is consistent with the UTLB, and uses an
LRU replacement algorithm when a miss is encountered.



Section 2: Chapter 2, "Registers and Data Types" 

This section describes additional information and corrections to Chapter 2, "Registers and
Data Types."



2.1 Replace Figure 2-1 with the following:

1 MPC601-only registers. These registers are not necessarily supported by other PowerPC processors.
2 These registers may be implemented differently on other PowerPC processors. The PowerPC architecture defines two sets of
BAT registers, eight IBATs and eight DBATs. The MPC601 implements the IBATs and treats them as unified BATs.
3 The RTCU and RTCL registers can be written only in supervisor mode, in which case SPR20 and SPR21 are used.
4 The DEC register can be read by user programs by specifying SPR6 in the mfspr instruction (for POWER compatibility).

Figure 2-1. Programming Model - Registers

2.1 The mechanism referred to for accessing SPRs is the set of Move to/from SPR
instructions (mtspr and mfspr). These instructions are commonly used to
access certain registers, while other SPRs may be more typically accessed as
the side effect of executing other instructions.

2.1 The MSR register is 64 bits wide in 64-bit implementations and is 32 bits wide
in 32-bit implementations.

2.2.4.1 Replace the first paragraph in this section with the following: 

In most integer instructions, when the Rc bit is set, the first three bits in CR0
are set by an algebraic comparison of the result to zero; the fourth bit of CR0
is copied from the XER[SO] bit. The addic., andi., and andis. instructions set
these four bits implicitly. These bits are interpreted as follows: if any portion
of the result (the 32-bit value placed into the target register) is undefined, the
value placed into the first three bits of CR0 is undefined.
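
The CR0 update described above can be sketched in Python as follows. This is an illustrative model only, not MPC601 logic: the function name is hypothetical, and the bit names LT, GT, and EQ for the first three bits follow the usual PowerPC convention rather than anything stated in this addendum.

```python
def cr0_bits(result, xer_so):
    """Return (LT, GT, EQ, SO) for a 32-bit result, as recorded when Rc is set.

    Hypothetical helper for illustration; bit names follow PowerPC convention.
    """
    value = result & 0xFFFFFFFF
    # Interpret the 32-bit pattern as a signed two's-complement integer,
    # because the comparison with zero is algebraic (signed).
    signed = value - (1 << 32) if value & 0x80000000 else value
    lt = int(signed < 0)
    gt = int(signed > 0)
    eq = int(signed == 0)
    # The fourth bit is a copy of XER[SO].
    return (lt, gt, eq, xer_so & 1)
```

For example, a result of x'FFFFFFFF' is treated as -1, so only the first (LT) bit is set, plus the fourth bit if XER[SO] is set.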

2.2.5 The mechanism referred to for accessing SPRs is the set of Move to/from SPR
instructions (mtspr and mfspr). These instructions are commonly used to
access certain registers, while other SPRs may be more typically accessed as
the side effect of executing other instructions.

2.2.5.3 In user-level access, RTCU and RTCL are read-only. The SPR numbers for the 
RTCU and RTCL differ depending upon whether the mtspr or mfspr 
instruction is used. For the mfspr instruction, RTCU is SPR4 and RTCL is 
SPR5. For the mtspr instruction, RTCU is SPR20 and RTCL is SPR21 
(supervisor-level access only). 
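
The asymmetric SPR numbering can be captured in a small lookup table. The sketch below is illustrative Python (the table and function names are hypothetical); it only restates the encodings given above and rejects a user-level write.

```python
# SPR numbers for the real-time clock registers, keyed by (instruction, register).
RTC_SPR = {
    ("mfspr", "RTCU"): 4,
    ("mfspr", "RTCL"): 5,
    ("mtspr", "RTCU"): 20,
    ("mtspr", "RTCL"): 21,
}


def rtc_spr_number(instruction, register, supervisor):
    """Return the SPR number to encode, or raise if the access is not allowed."""
    if instruction == "mtspr" and not supervisor:
        # RTCU and RTCL are read-only at user level.
        raise PermissionError("RTCU/RTCL can be written only in supervisor mode")
    return RTC_SPR[(instruction, register)]
```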

2.3.3 The MSR is not an SPR and should not be included in Table 2-15. 

2.3.3 The RTCU and RTCL registers can be written to only in supervisor mode and
the mtspr instruction requires a different SPR encoding. For the mtspr
instruction, RTCU is SPR20 and RTCL is SPR21.

2.3.3.4 The PowerPC architecture defines the DEC register as supervisor-only access
for both reads and writes. SPR22 is used for both reads and writes. The
POWER architecture provides user-level read access, using SPR6. To ensure
compatibility with subsequent PowerPC processors, the mfspr instruction
should not be used to read the DEC at user level.

2.3.3.10 The PVR is a read-only register that cannot be modified. 

2.3.3.12.1 The HID0 register is set to x'8001 0080' by the hard reset operation. However,
the state of the EMC bit depends on the results of the power-on diagnostics for
the main cache array. This bit is set if the cache fails the built-in self test during
the power-on sequence.

2.3.3.12.1 Checkstop enable bits can be set or cleared without restriction. If a checkstop
source bit is set, it can be cleared; however, if the corresponding checkstop
condition is still present on the next clock, the bit will be set again. A checkstop
source bit can only be set when the corresponding checkstop condition occurs
and the checkstop enable bit is set; it cannot be set via an mtspr instruction.
That is, you cannot manually cause a checkstop condition.

ADDENDUM-4 Addendum to the PowerPC 601 RISC Microprocessor User's Manual MOTOROLA 



2.3.3.12.2 Note that when HID1[8-9] = 10, the trap address of x'2000' has a base address
indicated by the setting of MSR[IP]. This mode is valid for address
comparisons and may produce unpredictable results when used with the HID
single-instruction step mode.

2.4.3 Replace sentence 2 of paragraph 2 with the following:

All transfers of individual scalars between registers and storage are of double
words. A subset of the 64-bit scalar (e.g., a byte) is not addressable in storage.
As a result, to access any subset of the bits of a scalar, the entire 64-bit scalar
must be accessed, and when a storage location is read, the 64-bit value returned
is the 64-bit value last written to that location.

2.4.3 The following example shows how the byte ordering is changed from big- to
little-endian mode by setting HID0[28] (n refers to the address):

<msr[ee] is off (zero)>

n       sync              | Instructions
n+4     sync              | accessed in
n+8     sync              | big-endian mode
n+c     mtspr hid0(28)    |
n+10    sync              | Instructions
n+14    sync              | accessed in
n+18    sync              | little-endian mode



The same instruction sequence can be used to go from little- to big-endian
mode by clearing HID0[28].

Little-Endian Address Manipulation 

In little-endian operations, the three least significant bits of an address are
manipulated as described in Chapter 2, "Registers and Data Types," to provide
the appearance of a little-endian memory to the program for aligned loads and
stores, as follows:

New_addr(29) <- EA(29) xor (word | half | byte)

New_addr(30) <- EA(30) xor (half | byte)

New_addr(31) <- EA(31) xor (byte)

The physical address used for an access generated by a load or a store to an
operand that is less than a double word is modified as indicated. Addresses for
aligned double word accesses and cache control operations are not modified
since the endian mode has no effect on aligned accesses greater than one word.
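
The three equations above amount to XORing the low-order address bits with a mask chosen by operand size. The following Python sketch illustrates this; the function and table names are hypothetical, and it assumes the PowerPC bit-numbering convention in which bit 31 is the least significant bit of a 32-bit address.

```python
# XOR mask applied to address bits 29-31 (the three low-order bits),
# keyed by operand size in bytes. A double word is left unmodified.
LE_XOR_MASK = {
    1: 0b111,  # byte: bits 29, 30, and 31 are inverted
    2: 0b110,  # half word: bits 29 and 30 are inverted
    4: 0b100,  # word: bit 29 is inverted
    8: 0b000,  # double word: no modification
}


def little_endian_address(ea, size):
    """Return the modified address for an aligned access of `size` bytes."""
    return (ea ^ LE_XOR_MASK[size]) & 0xFFFFFFFF
```

Under these assumptions, a byte access to x'1000' is directed to x'1007', while an aligned double-word access to x'1008' is unchanged.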

The DAR and SRR0 will contain the program address (or the next sequential
address, as appropriate) after all exceptions. If the processor is in little-endian
mode, it will be a modified address. If the processor is in big-endian mode, the
address is unmodified.

The T bit does not affect address manipulation or the detection of alignment
exception conditions. Therefore, I/O interface controller operations and BUID
x'07F' segments receive the modified address. The ecowx and eciwx
instructions are treated as no-ops if the T bit is set regardless of whether the
MPC601 is in little-endian mode.

Because the MPC601 defines a cache block as 32 bytes, bits 27-31 of the
address are not used for snooping. The program address should be specified
when an address is loaded into HID2 or HID5. That is, if the processor is in
little-endian mode, a little-endian address should be specified, and if the
processor is in big-endian mode, a big-endian address should be specified.

Little-Endian Alignment Exceptions 

Additional alignment exception conditions can occur when the processor is in 
little-endian mode. 

Load/store multiple operands (regardless of EA) 

• lmw stmw

• lscbx stswi

• lswi stswx

• lswx

The new alignment exception conditions are prioritized with other alignment
exceptions ahead of data access exceptions. For more information see Section
2.4.6.2 "Misaligned Scalars."

Little-Endian Instruction Fetching 

In little-endian mode, instructions are fetched in big-endian order; however, 
the instructions are swapped within a double word before being passed to the 
instruction queue, thus putting the instructions in little-endian order for 
execution. On exceptions, the MPC601 reports the correct effective address (as 
defined by the programming model or computed by a storage access 
instruction) regardless of the endian mode selected. 

2.4.4 The second line of the program example is incorrect. Replace
double b; /* x'212223242225262728' doubleword */
with the following:

double b; /* x'2122232425262728' doubleword */

2.4.5 The MPC601 big- and little-endian mode operation differs from the PowerPC 
architecture in the following ways: 

• Choice of big- or little-endian modes is provided through HID0[LM], bit 28
of HID0. The PowerPC architecture defines two bits in the MSR for this
purpose.

• The basic mode switching sequence requires three sync instructions followed
by the mtspr access to HID0[28], followed by three more sync instructions.
This sequence should be used whenever the state of this bit is changed.

• External and decrementer interrupts should be disabled before executing the
sequence.

• The starting address of the sequence does not matter; however, the sequence
cannot cross a protection boundary.

• In some cases the mtspr access to HID0[LM] can occur twice depending on
the alignment of the instruction.

• In some cases not all of the sync instructions will actually be executed,
depending on the starting address of the sequence.

• Although HID0[LM] can be switched dynamically, there are certain
constraints (such as turning off translation and emptying the memory queues)
that must be considered before the bit can be switched. Note that, when
switching modes between tasks, this code sequence may not allow the
MPC601 to operate at an optimal performance level.

Section 3: Chapter 3, "Addressing Modes and 
Instruction Set Summary" 

This section describes additional information and corrections to Chapter 3, "Addressing 
Modes and Instruction Set Summary." 

Section 
Number 

3.1.2 The first sentence in this section should include the isync instruction but should
not include the mtmsr instruction.

3.3.4.2 In Table 3-9, in the descriptions of the Shift Left Word, Shift Right Word, and
Shift Right Algebraic Word instructions the number of bits specified by rB
should be rB(27-31) instead of rB(26-31).

3.4.3 The PowerPC architecture specifies that for the two floating-point convert to
integer instructions, fctiw and fctiwz, both FPSCR(VXCVI) and
FPSCR(VXSNAN) are set when the input operand is an SNaN. The MPC601
sets only FPSCR(VXCVI).

3.5.5 Add the following information: 

In future implementations, the load/store multiple instructions are likely to
have greater latency and take longer to execute, perhaps much longer, than a
sequence of individual load or store instructions that produce the same results.

3.5.6 Add the following information:

In future implementations, the integer move string instructions are likely to
have greater latency and take longer to execute, perhaps much longer, than a
sequence of individual load or store instructions that produce the same results.

3.5.7 Add the following information: 

The paired use of the lwarx and stwcx. instructions allows programmers to
emulate common semaphore operations such as test and set, compare and
swap, exchange memory, and fetch and add.

The concept behind this part of the architecture is that a processor may load a
semaphore from storage, compute a result based on the value of the semaphore,
and conditionally store it back to the same location. The conditional store is
performed based upon the existence of a reservation established by the
preceding lwarx instruction. If the reservation exists when the store is
executed, the store is performed and a bit is set in the condition register.

If the reservation does not exist when the store is executed, the target storage
location is not modified and a bit in the condition register is cleared. The lwarx
and stwcx. primitives allow software to read a semaphore, compute a result
based on the value of the semaphore, store the new value back into the
semaphore location only if that location has not been modified since it was first
read, and determine if the store was successful.

If the store was successful, the sequence of instructions from the read of the
semaphore to the store that updated the semaphore appears to have been
executed atomically (i.e., no other processor or mechanism modified the
semaphore location between the read and the update), thus providing the
equivalent of a real atomic operation. However, other processors may have
read from the location during this operation.

The reservation set by an lwarx instruction can be cleared by the following
conditions: 

• The processor having the reservation executes a store conditional instruction 
to any address. 

• Another device executes any store instruction to any address in the 32-byte 
sector associated with the reservation. 

• The processor with the reservation takes any exception. 

• The processor with the reservation executes an sc instruction.

• The processor with the reservation executes a trap instruction that takes a 
program exception. 
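
The reservation behavior described above can be illustrated with a toy software model. The Python sketch below is not MPC601 code: the class, its methods, and the fetch_and_add helper are hypothetical, and the model covers only two of the listed clearing conditions (a store conditional by the reserving processor, and a store by another device into the same 32-byte sector).

```python
class ReservationModel:
    """Toy single-processor model of lwarx/stwcx. reservation behavior."""

    SECTOR_BYTES = 32

    def __init__(self):
        self.memory = {}          # address -> word value
        self.reservation = None   # reserved sector index, or None

    def lwarx(self, addr):
        # Load the word and establish a reservation on its sector.
        self.reservation = addr // self.SECTOR_BYTES
        return self.memory.get(addr, 0)

    def stwcx(self, addr, value):
        # The store is performed only if a reservation exists.
        success = self.reservation is not None
        if success:
            self.memory[addr] = value
        # A store conditional to any address clears the reservation.
        self.reservation = None
        return success

    def other_device_store(self, addr, value):
        # A store by another device into the reserved sector clears it.
        self.memory[addr] = value
        if self.reservation == addr // self.SECTOR_BYTES:
            self.reservation = None


def fetch_and_add(model, addr, delta, interfere=None):
    """Retry until the conditional store succeeds; return (old value, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        old = model.lwarx(addr)
        if interfere is not None and attempts == 1:
            interfere()  # simulate another device storing during the first try
        if model.stwcx(addr, old + delta):
            return old, attempts
```

The retry loop mirrors the usual software pattern: if the conditional store reports failure, the whole read-compute-store sequence is repeated.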

3.5.7 In a uniprocessor system, a program that modifies instructions it intends to
execute must execute an isync instruction to ensure that all modifications are
made visible to the instruction queue. Note that additional instructions are
required for other PowerPC processors.

3.5.10.1 Replace the first sentence of this section with the following:

The steps for converting a floating-point value from the double-precision
register format to single-precision memory format are as follows:

3.6.1.1 Note that the LI field is appended with b'00' prior to the addition.

3.6.1.2 Note that the BD field is appended with b'00' prior to the addition.

3.6.1.3 Note that the LI field is appended with b'00' prior to the addition.

3.6.1.4 Note that the BD field is appended with b'00' prior to the addition.
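
Appending b'00' to the LI or BD field is equivalent to shifting the field left by two bits before sign extension. The Python sketch below illustrates the resulting target calculation. The function names are hypothetical, and the field widths (24 bits for LI, 14 bits for BD) and the use of the current instruction address as the base for relative branches are taken from the PowerPC architecture rather than from this addendum.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(field, field_bits, cia, absolute=False):
    """Compute a 32-bit branch target from an LI (24-bit) or BD (14-bit) field.

    cia is the address of the branch instruction; it is ignored for
    absolute branches.
    """
    # Append b'00' (shift left by two), then sign-extend the widened field.
    displacement = sign_extend(field << 2, field_bits + 2)
    base = 0 if absolute else cia
    return (base + displacement) & 0xFFFFFFFF
```

Because the two appended bits are always zero, every branch target is word aligned.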

3.6.3 The PowerPC architecture defines the bcctr instruction with the "decrement
and test CTR" (BO2 = 0) option as an invalid form, and attempting to execute
such an instruction causes boundedly undefined results. However, the MPC601
tests the count register for zero and branches based on the result. Instruction
fetching is directed to the address specified in the non-decremented version of
the count register.

3.7.1 The RTCU and RTCL registers can be read in user level and can be written to
in supervisor level. The SPR encodings for reading the RTCU and RTCL
registers are 4 and 5, respectively (regardless of whether the processor is in
user or supervisor level). The SPR encodings for writing RTCU and RTCL are
20 and 21, respectively.

3.8.4 The essence of the tlbie instruction may be broadcast onto the MPC601 bus
interface. This function is enabled by setting HID1[17].

Section 4: Chapter 4, "Cache and Memory Unit 
Operation" 

This section describes additional information and corrections to Chapter 4, "Cache and 
Memory Unit Operation." 

4.8 Delete the next to the last paragraph in this section.

4.8.8 Replace the second paragraph with the following:

The dcbi instruction cannot be used to invalidate instructions in the cache of
the MPC601. This instruction may have the effect of unmodifying data storage
depending upon timing, exceptions, and other events.

4.11 In row #12 on page 4-28, "Four-beat write (quadword 2)" should update the
sector status. The Current State column correctly contains an x but the Next
State Column specifies no change. The next state should be M.

Section 5: Chapter 5, "Exceptions"

This section describes additional information and corrections to Chapter 5, "Exceptions." 



Section
Number

5.1 The last bullet in the entry for the instruction access exception in Table 5-1
should read as follows:

If the K bits in the segment register and the PP bits in the PTE or BAT are set
to prohibit read access, instructions cannot be fetched from this location.

5.1.1.3 The second bulleted item in this section should read as follows:

SRR0 addresses either the instruction that would have completed or some
instruction following it that would have completed if the exception had not
occurred.

5.4.4 The following information should replace the appropriate bit descriptions for
SRR1 in Table 5-12.

3 Cleared. Note that the PowerPC architecture defines this as set if the fetch access was to an I/O
controller interface segment (SR[T]=1). Note that this condition causes SRR1[0-15] to be cleared
in the MPC601.

10 Cleared

5.4.5 In early versions of the MPC601 (processor revision level x'0000'), the external
interrupt is a level-sensitive signal and should be held active until reset by the
interrupt service routine. Phantom interrupts due to phenomena such as
crosstalk and bus noise should be avoided.

In later versions of the MPC601 (processor revision level x'0001' and higher),
the MPC601 is guaranteed to detect an external interrupt when the INT signal
is held active for at least two clock cycles. The MPC601 is guaranteed to ignore
the INT signal if it is held for less than one clock cycle.

5.4.6 The first DSISR value listed in Table 5-14 should be as follows:

000000000000 00 01 0101 ttttt ?????

The following should be added to Table 5-14:

DAR Set to the EA of the data access as computed by the instruction causing the
alignment exception.



5.4.6.1.2 Replace the third bulleted item with the following:

The lwarx/stwcx./lscbx instructions that map into an I/O controller interface
segment always cause a data access exception. However, if the instruction
crosses a segment boundary an alignment exception is taken instead.

5.4.10 Note also that unlike exceptions that occur with memory accesses, loads, and
both loads and stores with update, to I/O controller interface segments cause the
target register to be updated, regardless of whether an exception is taken.

5.4.12 Replace this section with the following:

Run Mode/Trace Exception (x'02000')

The MPC601 defines an implementation-specific exception called the run mode exception.
This exception is taken by the MPC601 under the following circumstances:

• Instruction address compare

• Branch target address compare

• Trace mode (MSR[SE] is set): When an instruction clears MSR[SE], trace mode
ends immediately. Note that other PowerPC processors implement a separate trace
exception at vector x'00D00'.

Note that this exception may not be implemented by other PowerPC processors, and that
this exception can be enabled and disabled using bits 8 and 9 in HID1; the exception is
enabled when HID1[8,9] = b'01'. When this exception occurs, the registers are set as
indicated in Table 5-24.

Table 5-24. Run Mode Exception - Register Settings

Register    Setting

SRR0        Set to the address of the instruction that causes the run mode exception

SRR1        Loaded from bits 0-31 of the MSR

MSR         EE                            SE
            PR                            FE1
            FP                            EP   Value is not altered
            ME   Value is not altered     IT
            FE0                           DT


The run mode is determined by the settings of HID1[1-3]. These settings are defined in
Table 5-25.

Table 5-25. Run Modes Setting

HID1[1-3] Setting    Run Mode

000                  Normal run mode
001                  Undefined. Do not use.
010                  Limited instruction address compare.
011                  Undefined. Do not use.
100                  Single instruction step
101                  Undefined. Do not use.
110                  Full instruction address compare
111                  Full branch target address compare

Table 5-26 describes the run modes.

Table 5-26. Run Modes Description

Mode                           Description

Normal run mode                No address breakpoints are specified and the MPC601 processes zero to
                               three instructions per cycle.

Single instruction step mode   In single instruction step mode, the fetcher processes one instruction at a
                               time. After an instruction is processed and the chip quiesces, the appropriate
                               break action is performed. Note that this mode is distinct from the trace
                               exception, which depends on the setting of MSR[SE].

Limited instruction address    The MPC601 runs at full speed until the EA of the instruction in the lowest
compare mode                   position in the instruction queue (IQ0) matches the one specified in HID2. At
                               this point the appropriate break action is performed. This is a limited compare
                               in that branches and floating-point operations and the addresses associated
                               with them may never be detected.

Full instruction address       In full instruction address compare mode, processing proceeds out of IQ0.
compare mode                   When the EA in HID2 matches the EA of the instruction in IQ0, the
                               appropriate break action is performed. Unlike the limited instruction address
                               compare mode, all instructions pass through the IQ0 in this mode. That is,
                               instructions cannot be folded out of the instruction stream.

Full branch target address     This mode is similar to full instruction address compare mode except that the
compare mode                   branch target is compared against HID2. When addresses match, the
                               appropriate break action is taken. This allows the programmer to see how a
                               program got to an address. This mode can be used with b, bc, bcr, and bcc
                               instructions.
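
The HID1[1-3] encodings in Table 5-25 can be decoded as in the Python sketch below. The function and table names are hypothetical. The sketch assumes HID1 is handled as a 32-bit value and uses the PowerPC bit-numbering convention in which bit 0 is the most significant bit, so bits 1-3 sit just below the top bit.

```python
# Defined encodings from Table 5-25; all others are undefined.
RUN_MODES = {
    0b000: "Normal run mode",
    0b010: "Limited instruction address compare",
    0b100: "Single instruction step",
    0b110: "Full instruction address compare",
    0b111: "Full branch target address compare",
}


def run_mode(hid1):
    """Return the run mode selected by HID1[1-3] for a 32-bit HID1 value."""
    # Bit 0 is the most significant bit, so bits 1-3 are found by shifting
    # right 28 places and masking off the (now fourth) bit 0.
    field = (hid1 >> 28) & 0b111
    return RUN_MODES.get(field, "Undefined. Do not use.")
```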



When the trace exception is enabled (MSR[SE] is set), a trace interrupt is taken after each
instruction that completes without causing an exception or context change (such as an sc,
rfi, or a load instruction that causes an exception). MSR[SE] is cleared when the trace
exception is taken. In the normal use of this function, MSR[SE] is restored when the
exception handler returns to the interrupted program using an rfi instruction.

Register settings for the trace mode are described in Table 5-27. 

Table 5-27. Trace Exception - Register Settings

Register    Setting

SRR0        Set to the address of the next instruction to be executed in the
            program for which the trace exception was generated

SRR1        0-15   Cleared
            16-31  Loaded from bits 16-31 of the MSR

MSR         EE                            SE
            PR                            FE1
            FP                            EP   Value is not altered
            ME   Value is not altered     IT
            FE0                           DT

When a run mode or trace exception is taken, instruction execution resumes at offset
x'02000' from the base address indicated by MSR[EP].
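
As a rough illustration, the resume address can be computed as in the Python sketch below. The function name is hypothetical, and the two base addresses (x'00000000' when MSR[EP] is cleared and x'FFF00000' when it is set) are an assumption based on the usual PowerPC exception-prefix convention; they are not stated in this addendum.

```python
RUN_MODE_VECTOR_OFFSET = 0x02000


def run_mode_vector(msr_ep):
    """Return the address at which execution resumes for a given MSR[EP]."""
    # Assumed base addresses (PowerPC exception-prefix convention):
    # EP = 0 selects x'00000000', EP = 1 selects x'FFF00000'.
    base = 0xFFF00000 if msr_ep else 0x00000000
    return base + RUN_MODE_VECTOR_OFFSET
```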

Section 6: Chapter 6, "Memory Management Unit" 

This section describes additional information and corrections to Chapter 6, "Memory 
Management Unit." 

6.1.8 The first two bullet items under constraints enforced for instruction prefetching
should be deleted.

6.7.6 The dcbt/dcbtst instruction branch of Figure 6-10 should say "Abort Access"
instead of "Abort Translation."

6.9.1.5.2 The Hash Value 2 shown in Figure 6-22 shows an extra 4 bits (1111) that should
be deleted. The Hash Value 2 should be replaced with the following: "101 1000
0000 0001 1001."

6.9.2 Steps 1 and 5 of the page table search operation imply that PTEs are read into
the processor as single-beat read operations. In reality, the MPC601 performs
a burst read operation at the PTE address (to load a sector of the on-chip cache)
and optionally performs a second burst read operation to fill the cache line.
However, the PTEs are read from the cache and compared with the virtual
address information one at a time.

6.9.2 At the top of Figure 6-23, the fetch of a PTE is described as a single-beat read
from PA. This should be replaced with a box showing that four PTEs are burst
in at once, and four more PTEs may be burst in to fill a cache line.

Section 7: Chapter 7, "Instruction Timing"

This section describes additional information and corrections to Chapter 7, "Instruction 
Timing." 



7.1 Replace Figure 7-1 with the following:

            Stage 1    Stage 2    Stage 3
CLOCK 0     A
CLOCK 1     B          A
CLOCK 2     C          B          A
CLOCK 3                C          B

Figure 7-1. Pipelined Execution Unit

7.3.3 Replace the last two paragraphs in this section with the following:

Double-precision floating-point multiply instructions spend multiple clock 
cycles in the decode and execute stages of the FPU. However, the fmul 
instruction is broken down into two parts (which in the FPU pipeline appear to 
be two instructions). This allows the instruction to occupy two stages in the 
FPU simultaneously. The first part of the instruction can begin FPU execute 1 
stage as the second part enters the decode stage. Likewise, when the first part 
of the instruction enters FPU execute 2 stage, the second part enters execute 1 
stage. 

This self-pipelining reduces the latency to five cycles and improves the 
throughput. For example, a series of fmul instructions would have a 
throughput of one instruction every two cycles. 
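
The latency and throughput figures above suggest a simple estimate for a run of back-to-back fmul instructions. The Python sketch below is an idealized approximation, not a figure from the manual: the function name is hypothetical, and it assumes the FPU is otherwise idle, the instructions are independent, and no stalls occur.

```python
FMUL_LATENCY = 5   # cycles until the first result, per the text above
FMUL_INTERVAL = 2  # cycles between successive results


def fmul_series_cycles(count):
    """Estimate cycles to complete `count` back-to-back fmul instructions."""
    if count <= 0:
        return 0
    # The first instruction pays the full latency; each later one
    # completes one issue interval after its predecessor.
    return FMUL_LATENCY + FMUL_INTERVAL * (count - 1)
```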

Section 8: Chapter 8, "Signal Descriptions" 

This section describes additional information and corrections to Chapter 8, "Signal 
Descriptions." 

Section 
Number 

8.1 Replace Figure 8-1 with Figure 1-7 (shown on page Addendum-2 of this
addendum).

8.2.4.1.2 Replace the first two entries in Table 8-1 with the following: 



TT0 Special operations: This signal is asserted whenever a bus transaction is run in response to a
lwarx/stwcx. instruction pair, a TLBI (translation lookaside buffer invalidate) operation, or either an
eciwx or ecowx instruction.

TT1 Read (or write) operations: This signal indicates whether the transaction is a read (TT1 high) or a write
(TT1 low). This assumes that the transaction is not address-only.



8.2.4.2. 1 Replace the last paragraph of the State Meaning section with the following: 

For external control instructions (eciwx and ecowx), TSIZ0-TSIZ2 are used to
output bits 29-31 of the external access register (EAR), which are used to form
the resource ID (TBST||TSIZ0-TSIZ2).

8.2.4.3.1 Replace the first paragraph of the State Meaning section with the following: 

Asserted — Indicates that a burst transfer is in progress. 
8.2.4.4 Replace the first sentence with the following:

The transfer code (TC0-TC1) consists of two output signals on the MPC601.

Replace the first entry in Table 8-4 with the following:

Assertion depends on whether the current transaction is a read or write operation; therefore, TC0
should be used with TT1. On a read operation, TC0 asserted indicates the transaction is an instruction
fetch operation; otherwise, the read operation is a data operation.

Asserting TC0 for write operations indicates the cache sector associated with a write is being
invalidated; TC0 negated indicates the cache sector associated with a write is not being invalidated.
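
The four combinations of TT1 and TC0 described in this entry can be summarized as in the Python sketch below. The function name and the returned strings are hypothetical. The arguments are logical states (True meaning TT1 is high or TC0 is asserted) rather than electrical levels, and the transaction is assumed not to be address-only.

```python
def describe_tc0(tt1_high, tc0_asserted):
    """Interpret TC0 in the context of TT1 for a non-address-only transaction."""
    if tt1_high:
        # TT1 high marks a read; TC0 distinguishes instruction from data.
        return "instruction fetch" if tc0_asserted else "data read"
    # TT1 low marks a write; TC0 reports whether the sector is invalidated.
    if tc0_asserted:
        return "write, cache sector invalidated"
    return "write, cache sector not invalidated"
```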



8.2.4.6 Substitute the following for the State Meaning entry: 

Asserted — Indicates that a single-beat transaction is write-through, reflecting 
the value of the W bit for the block or page that contains the address of the 
current transaction. For burst writes, this indicates that the write is the result of 
a dcbf or dcbst instruction. 

Negated — Indicates that a transaction is not write-through. For bursts it is 
negated for cast-outs and snoop pushes. 

8.2.4.9 Add the following sentence to the first paragraph: 

This pin must be enabled by setting HID0[31] if it is to be used. 

The Timing Comments should read as follows: "Assertion/Negation — Must be 
valid throughout the entire address tenure." 

8.2.6.1 Substitute the following for the Asserted information in the State Meaning 
section: 

Asserted - Indicates that the MPC601 may, with the proper qualification,
assume mastership of the data bus. The MPC601 derives a qualified data bus
grant when DBG is asserted and DBB, DRTRY, and ARTRY are negated; that
is, the data bus is not busy (DBB is negated), there is no outstanding attempt to
retry the current data tenure (DRTRY is negated), and there is no outstanding
attempt to perform an ARTRY of the associated address tenure.
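
The qualification rule above reduces to a single Boolean expression. The Python sketch below is illustrative only; the function name is hypothetical, and each argument is a logical assertion state (True meaning the signal is asserted), not the electrical level on the pin.

```python
def qualified_data_bus_grant(dbg, dbb, drtry, artry):
    """Return True when DBG is asserted and DBB, DRTRY, and ARTRY are negated."""
    return bool(dbg) and not (dbb or drtry or artry)
```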

8.2.7.2.2 Substitute the following for the next-to-last sentence in the State Meaning 
section: 

Detected even parity causes a checkstop if data parity errors are enabled in the 
HID register. 

8.2.9.1 Replace the first paragraph of the State Meaning section with the following:

Asserted - The MPC601 latches the interrupt condition if the MSR[EE] bit is
set and ignores the interrupt condition otherwise. To guarantee that the
MPC601 will take the external interrupt, the INT pin must be held active until
the MPC601 takes the interrupt; otherwise, the MPC601 may or may not take
an external interrupt, depending on whether the MSR[EE] bit was set while the
INT signal was held active.

8.2.9.6 Add the following sentence to the first paragraph:

Note that systems that do not use this signal should tie it low.

8.2.11 This section should be deleted.

8.2.12.3 Replace Figure 8-6 with the following:

[Timing diagram not reproduced; columns numbered 0 through 11]
*Delay of inverter output
Figure 8-6. Generation of Bus Transactions - Logical Bus Clock = 1/2 P_CLK

Section 9: Chapter 9, "System Interface Operation" 

9.1.2 Replace the last sentence in the second bulleted item with the following:

The update of the other sector can be disabled by setting bits in the HID0
register. HID0[DRF], bit 26, can be used to disable fetches and HID0[DRL],
bit 27, can be used to disable loads and stores.

9.2 Replace the second paragraph with the following:

Figure 9-3 shows that the address and data tenures are distinct from one
another and that both consist of three phases: arbitration, transfer, and
termination. Address and data tenures are independent (indicated in Figure 9-3
by the fact that the data tenure begins before the address tenure ends), which
allows split-bus transactions to be implemented at the system level in
multiprocessor systems. Figure 9-3 shows a data transfer that consists of a
single-beat transfer of as many as 64 bits. Four-beat burst transfers of 32-byte
cache sectors require data transfer termination signals for each beat of data.

9.3.2.1 Delete the second paragraph. 

9.3.2.3 Substitute the following row in Table 9-3.

First transfer: two bytes    010    110    -    -    -    -    -    -    A    A



9.3.2.4 Replace the last sentence of the second paragraph with the following:

TC0 negated indicates the write is not invalidating any cache sector (for
example, write-through or cache-inhibited write operations).

9.3.3 The second sentence of the second paragraph should read as follows:

After ARTRY and SHD are asserted, they will be three-stated for two bus
cycles and the system is responsible for precharging both ARTRY and SHD
signals.

9.3.3 Add tlie following sentence to the end of the fourth bulleted item: 

The override mode uses the HP_SNP_REQ signal to determine if the snoop 
queue is to be used. This mode is enabled by setting HID0[31]. 

9.4.2 Replace the first sentence of the third paragraph with the following: 

The type of transaction initiated by the MPC601 depends on whether the code
or data is cacheable and, for store operations, whether the cache is operated in
write-back or write-through mode, which software controls on either a page or
block basis.

9.4.3 Replace the last sentence with the following: 

ARTRY can also terminate a data bus transaction. For burst transactions, this
ARTRY must occur no later than the cycle of the second TA. For single-beat
transactions, it must occur no later than the cycle following TA. In either case,
the ARTRY must be for the address bus tenure associated with the data bus
tenure.
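
The two deadlines can be written as a small check. This is only a sketch under stated assumptions: bus cycles are modeled as plain integers, the single-beat deadline is read as the cycle after TA, and the function name and arguments are hypothetical.

```python
def artry_in_window(burst: bool, artry_cycle: int, ta_cycles: list[int]) -> bool:
    """Check an ARTRY assertion against the deadline described above.

    ta_cycles lists, in order, the bus cycles in which TA is asserted.
    """
    if burst:
        if len(ta_cycles) < 2:
            raise ValueError("a burst needs at least two TA cycles")
        deadline = ta_cycles[1]      # no later than the cycle of the second TA
    else:
        if not ta_cycles:
            raise ValueError("a single-beat transfer needs one TA cycle")
        deadline = ta_cycles[0] + 1  # no later than the cycle following TA
    return artry_cycle <= deadline
```

The check covers timing only; whether the ARTRY belongs to the address tenure associated with the data tenure has to be established separately.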

9.4.3.1 Replace Figure 9-10 with the following:

[Timing diagram not recoverable from the source.]

Figure 9-10. Normal Single-Beat Read Termination

Replace Figure 9-11 with the following:

[Timing diagram not recoverable from the source.]

Figure 9-11. Normal Single-Beat Write Termination

9.5 The clock signals at the bottom of the figures in this section should be ignored.

Replace the first two paragraphs with the following:

This section shows timing diagrams for various scenarios. Figure 9-16
illustrates the fastest single-beat reads. This figure shows both minimal latency
and maximum single-beat throughput. By delaying the data bus tenure, the
latency increases, but, because of split-transaction pipelining, the overall
throughput is not affected unless the data bus latency causes the third address
tenure to be delayed.

Note that all bidirectional signals go to high-impedance between bus tenures.

9.6.4 Replace the last sentence in the fourth paragraph with the following:

The MPC601 involved in this transaction, however, does not initiate any other
I/O controller load or store operations once the first I/O controller interface
operation has begun address tenure; however, if the I/O operation is retried,
other higher-priority operations can occur.

9.6.4 Replace the last sentence of the last paragraph with the following:

If the TEA signal is not asserted with each tenure of a given I/O controller
interface operation, the result of the assertion of TEA is unpredictable. The
MPC601 may take a machine check exception or cause a checkstop condition.

9. 1 Add the following note after step 5:

Note that steps 4 and 5 can occur in either order.

Section 10: Chapter 10, "Instruction Set"

This section describes additional information and corrections to Chapter 10, "Instruction
Set."

lmw Add the following information:

In future implementations, this instruction is likely to have greater latency and
take longer to execute, perhaps much longer, than a sequence of individual load
instructions that produce the same results.

lfsu The reference in the description of this instruction should be to Section 3.5.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions."

lfsux The reference in the description of this instruction should be to Section 3.5.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions."

lfsx The reference in the description of this instruction should be to Section 3.5.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions."

lswi Add the following information:

In future implementations, this instruction is likely to have greater latency and
take longer to execute, perhaps much longer, than a sequence of individual load
instructions that produce the same results.

mffs This instruction is executed in the FPU rather than the IU.

lswx Add the following information:

Under certain conditions (for example, segment boundary crossings) the
alignment error handler may be invoked. For additional information about
alignment exceptions, see Section 5.4.6, "Alignment Exception (x'00600')."

In future implementations, this instruction is likely to have greater latency and
take longer to execute, perhaps much longer, than a sequence of individual load
instructions that produce the same results.

mfspr Replace Table 10-4 with the following:

Table 10-4. SPR Encodings for mfspr

             SPR¹
Decimal  SPR[5-9]  SPR[0-4]  Register Name               Access
0        00000     00000     MQ                          User
1        00000     00001     XER                         User
4        00000     00100     RTCU²                       User
5        00000     00101     RTCL²                       User
6        00000     00110     DEC³                        User
8        00000     01000     LR                          User
9        00000     01001     CTR                         User
18       00000     10010     DSISR                       Supervisor
19       00000     10011     DAR                         Supervisor
22       00000     10110     DEC³                        Supervisor
25       00000     11001     SDR1                        Supervisor
26       00000     11010     SRR0                        Supervisor
27       00000     11011     SRR1                        Supervisor
272      01000     10000     SPRG0                       Supervisor
273      01000     10001     SPRG1                       Supervisor
274      01000     10010     SPRG2                       Supervisor
275      01000     10011     SPRG3                       Supervisor
282      01000     11010     EAR                         Supervisor
287      01000     11111     PVR                         Supervisor
528      10000     10000     BAT0U                       Supervisor
529      10000     10001     BAT0L                       Supervisor
530      10000     10010     BAT1U                       Supervisor
531      10000     10011     BAT1L                       Supervisor
532      10000     10100     BAT2U                       Supervisor
533      10000     10101     BAT2L                       Supervisor
534      10000     10110     BAT3U                       Supervisor
535      10000     10111     BAT3L                       Supervisor
1008     11111     10000     Checkstop Register (HID0)   Supervisor
1009     11111     10001     Debug Mode Register (HID1)  Supervisor
1010     11111     10010     IABR (HID2)                 Supervisor
1013     11111     10101     DABR (HID5)                 Supervisor
1023     11111     11111     PIR (HID15)                 Supervisor

¹Note that the order of the two 5-bit halves of the SPR number is reversed compared with
actual instruction coding.

If the SPR field contains any value other than one of these implementation-specific
values or one of the values shown in Table 3-40, the instruction form is invalid.
SPR[0]=1 if and only if the register is being accessed at the supervisor level. Execution of
this instruction specifying a defined and supervisor-level register when MSR[PR]=1
results in a privilege violation type program exception.

For mtspr and mfspr instructions, the SPR number coded in assembly language does
not appear directly as a 10-bit binary number in the instruction. The number coded is split
into two 5-bit halves that are reversed in the instruction, with the high-order 5 bits
appearing in bits 16-20 of the instruction and the low-order 5 bits in bits 11-15.

SPR encodings for DEC, MQ, RTCL, and RTCU are not part of the PowerPC
architecture.

²On the MPC601, the mfspr instruction for the RTCU and RTCL registers must use these
encodings (SPR4 and SPR5, respectively) regardless of whether the processor is in
supervisor or user mode. The mtspr instruction, which is supervisor-only for the RTCU
and RTCL registers, must use the SPR20 and SPR21 encodings, respectively.

³Read access to the DEC register is supervisor-only in the PowerPC architecture, using
SPR22. However, the POWER architecture allows user-level read access using SPR6.
Note that the SPR6 encoding for the DEC will not be supported by other PowerPC
processors.
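
The swapped halves are easiest to see by assembling an instruction word. The following sketch assumes the standard PowerPC XFX-form layout (primary opcode 31, extended opcode 339 for mfspr and 467 for mtspr); the function names are hypothetical and the code is an illustration, not an assembler.

```python
PRIMARY_OPCODE = 31  # primary opcode shared by mfspr and mtspr
XO_MFSPR = 339       # extended opcode for mfspr
XO_MTSPR = 467       # extended opcode for mtspr


def spr_field(spr: int) -> int:
    """Return the 10-bit SPR field as it sits in instruction bits 11-20."""
    if not 0 <= spr < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    low, high = spr & 0x1F, spr >> 5
    # Low-order half goes to bits 11-15, high-order half to bits 16-20.
    return (low << 5) | high


def _encode(reg: int, spr: int, xo: int) -> int:
    if not 0 <= reg < 32:
        raise ValueError("GPR number must be 0..31")
    return (PRIMARY_OPCODE << 26) | (reg << 21) | (spr_field(spr) << 11) | (xo << 1)


def encode_mfspr(rd: int, spr: int) -> int:
    """Instruction word for mfspr rD,SPR."""
    return _encode(rd, spr, XO_MFSPR)


def encode_mtspr(spr: int, rs: int) -> int:
    """Instruction word for mtspr SPR,rS."""
    return _encode(rs, spr, XO_MTSPR)
```

For example, `encode_mfspr(0, 8)` (read LR into r0) gives `0x7C0802A6`, where the SPR number 8 appears in the upper half of the field rather than the lower.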

mtspr Replace Table 10-5 with the following:

Table 10-5. SPR Encodings for mtspr

             SPR¹
Decimal  SPR[5-9]  SPR[0-4]  Register Name               Access
0        00000     00000     MQ                          User
1        00000     00001     XER                         User
8        00000     01000     LR                          User
9        00000     01001     CTR                         User
18       00000     10010     DSISR                       Supervisor
19       00000     10011     DAR                         Supervisor
20       00000     10100     RTCU²                       Supervisor
21       00000     10101     RTCL²                       Supervisor
22       00000     10110     DEC³                        Supervisor
25       00000     11001     SDR1                        Supervisor
26       00000     11010     SRR0                        Supervisor
27       00000     11011     SRR1                        Supervisor
272      01000     10000     SPRG0                       Supervisor
273      01000     10001     SPRG1                       Supervisor
274      01000     10010     SPRG2                       Supervisor
275      01000     10011     SPRG3                       Supervisor
282      01000     11010     EAR                         Supervisor
528      10000     10000     BAT0U                       Supervisor
529      10000     10001     BAT0L                       Supervisor
530      10000     10010     BAT1U                       Supervisor
531      10000     10011     BAT1L                       Supervisor
532      10000     10100     BAT2U                       Supervisor
533      10000     10101     BAT2L                       Supervisor
534      10000     10110     BAT3U                       Supervisor
535      10000     10111     BAT3L                       Supervisor
1008     11111     10000     Checkstop Register (HID0)   Supervisor
1009     11111     10001     Debug Mode Register (HID1)  Supervisor
1010     11111     10010     IABR (HID2)                 Supervisor
1013     11111     10101     DABR (HID5)                 Supervisor
1023     11111     11111     PIR (HID15)                 Supervisor

¹Note that the order of the two 5-bit halves of the SPR number is reversed compared with
actual instruction coding.

If the SPR field contains any value other than one of these implementation-specific 
values or one of the values shown in Table 3-40, the instruction form is invalid. 
SPR[0]=1 if and only if the register is being accessed at the supervisor level. Execution of 
this instruction specifying a defined and supervisor-level register when MSR[PR]=1 
results in a privilege violation type program exception. 

For mtspr and mfspr instructions, the SPR number coded in assembly language does 
not appear directly as a 10-bit binary number in the instruction. The number coded is split 
into two 5-bit halves that are reversed in the instruction, with the high-order 5 bits 
appearing in bits 16-20 of the instruction and the low-order 5 bits in bits 11-15. 

SPR encodings for DEC, MQ, RTCL, and RTCU are not part of the PowerPC 
architecture. 

²On the MPC601, the mfspr instruction for the RTCU and RTCL registers must use these
encodings (SPR4 and SPR5, respectively) regardless of whether the processor is in
supervisor or user mode. The mtspr instruction, which is supervisor-only for the RTCU
and RTCL registers, must use the SPR20 and SPR21 encodings, respectively.

³Read access to the DEC register is supervisor-only in the PowerPC architecture, using
SPR22. However, the POWER architecture allows user-level read access using SPR6.
Note that the SPR6 encoding for the DEC will not be supported by other PowerPC
processors.
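
The RTC and DEC rules in the two notes above amount to a small lookup. The sketch below restates them as data; the table and function names are hypothetical, and the `portable` flag is an assumption introduced here to choose between the PowerPC and POWER encodings for reading DEC.

```python
# (register, instruction) -> SPR number, restating the notes above
SPR_FOR_ACCESS = {
    ("RTCU", "mfspr"): 4,   # readable in user or supervisor mode
    ("RTCL", "mfspr"): 5,
    ("RTCU", "mtspr"): 20,  # writes are supervisor-only
    ("RTCL", "mtspr"): 21,
    ("DEC", "mtspr"): 22,
}


def spr_number(register: str, instruction: str, portable: bool = True) -> int:
    """Pick the SPR number to code for an RTC or DEC access on the MPC601."""
    if register == "DEC" and instruction == "mfspr":
        # SPR22 is the PowerPC (supervisor-only) encoding; SPR6 is the
        # POWER-compatible user-level encoding other processors will not support.
        return 22 if portable else 6
    try:
        return SPR_FOR_ACCESS[(register, instruction)]
    except KeyError:
        raise ValueError(f"no encoding listed for {instruction} {register}") from None
```

Code that must run on later PowerPC processors would keep `portable=True` and read DEC only from supervisor level.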
stmw Add the following information:

In future implementations, this instruction is likely to have greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual 
store instructions that produce the same results. 

stswi Add the following information: 

In future implementations, this instruction is likely to have greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual 
store instructions that produce the same results. 

stswx Add the following information:

In future implementations, this instruction is likely to have greater latency and 
take longer to execute, perhaps much longer, than a sequence of individual 
store instructions that produce the same results. 

10.3 The tlbiex instruction has been removed from the PowerPC architecture, and
should be deleted from Table 10-6.

Section 11: Appendixes

This section describes additional information and corrections to the appendixes. 

Section 
Number 

App. A The slbiex and tlbiex instructions, which are not implemented in the MPC601,
have been removed from the PowerPC architecture.

App. C The slbiex and tlbiex instructions, which are not implemented in the MPC601,
have been removed from the PowerPC architecture.

App. D The MPC601 takes an illegal instruction error exception for instructions that
the PowerPC architecture defines as reserved, except for those POWER
instructions that are implemented on the MPC601.

F.3.2 Replace Round Integer(frac,gbit,rbit,xbit,round_mode) with the following: 

Round Integer(frac,gbit,rbit,xbit,round_mode) 

In this example, u represents an undefined hexadecimal digit. Comparisons 
ignore the u bits. 

inc <- 0
If round_mode = b'00' then
Do
    If sign || frac[64] || gbit || rbit || xbit = b'u11uu' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'u011u' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'u01u1' then inc <- 1
End
If round_mode = b'10' then
Do
    If sign || frac[64] || gbit || rbit || xbit = b'0u1uu' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'0uu1u' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'0uuu1' then inc <- 1
End
If round_mode = b'11' then
Do
    If sign || frac[64] || gbit || rbit || xbit = b'1u1uu' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'1uu1u' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'1uuu1' then inc <- 1
End
frac[0-64] <- frac[0-64] + inc
FPSCR[FR] <- inc
FPSCR[FI] <- gbit | rbit | xbit
Return
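
The routine above can be exercised with a short model. This is a sketch under stated assumptions: frac[0-64] is held as an ordinary integer whose least significant bit is frac[64], the u positions are treated as wildcards, and the two FPSCR updates are returned as values instead of being written to a register. All names are hypothetical.

```python
# Patterns over sign || frac[64] || gbit || rbit || xbit that force an increment.
INCREMENT_PATTERNS = {
    0b00: ("u11uu", "u011u", "u01u1"),  # round to nearest
    0b10: ("0u1uu", "0uu1u", "0uuu1"),  # round toward +infinity
    0b11: ("1u1uu", "1uu1u", "1uuu1"),  # round toward -infinity
}


def matches(bits: str, pattern: str) -> bool:
    """Compare bit strings, ignoring positions marked 'u' in the pattern."""
    return all(p == "u" or p == b for b, p in zip(bits, pattern))


def round_integer(sign: int, frac: int, gbit: int, rbit: int, xbit: int,
                  round_mode: int) -> tuple[int, int, int]:
    """Return (rounded frac, FPSCR[FR], FPSCR[FI]) for the given inputs."""
    bits = f"{sign}{frac & 1}{gbit}{rbit}{xbit}"
    patterns = INCREMENT_PATTERNS.get(round_mode, ())  # mode b'01' never increments
    inc = int(any(matches(bits, p) for p in patterns))
    return frac + inc, inc, gbit | rbit | xbit
```

In round-to-nearest mode an exact tie (gbit set, rbit and xbit clear) increments only when frac[64] is 1, which is the round-half-to-even rule.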

App. H The following appendix should be added to the user's manual.

Appendix H 

MPC601 as a PowerPC Microprocessor 

The MPC601 processor is the first implementation of the PowerPC architecture. It offers a
reliable platform for software and hardware developers to make products compatible with
subsequent processors in the PowerPC family. In addition, the MPC601 provides
extensions to the PowerPC architecture that allow it to function as a bridge from the
POWER architecture. This appendix describes the POWER extensions as well as other
differences between the MPC601 and the PowerPC architecture (PowerPC Architecture,
First Edition). These differences can be categorized as follows:

• POWER extensions: Additional functionality not defined in the PowerPC
architecture. For example, the MPC601 implements many POWER instructions that
do not have PowerPC equivalents.

• Variances: MPC601 functionality that is implemented differently than described
in the PowerPC architecture. For example, there are several differences between the
MPC601 MMU implementation and that specified by the PowerPC architecture. In
general, these variances are not visible from the user level.

• Implementation-dependent extensions: These include features that are not part of
but are allowed by the PowerPC architecture. For example, the MPC601 provides a
set of implementation-dependent registers (HIDs) to control hardware features such
as parity checking and instruction address breakpoint that are beyond the
specifications in the architecture. Software should take appropriate precautions to
control use of these features.

• PowerPC optional features: These include optional features defined in the
PowerPC architecture that are implemented in the MPC601.

This appendix does not describe performance trade-offs allowed by the PowerPC
architecture. For example, some implementations may provide more support for alignment
than others and therefore they may require different amounts of assistance in the interrupt
handler.

This appendix also does not describe variances built into the PowerPC architecture to
provide some latitude in PowerPC implementations for handling reserved, invalid, and
undefined conditions. These aspects are left intentionally undefined. Note that while the
MPC601's treatment of such aspects may be predictable, taking advantage of that behavior
may cause software incompatibilities with other PowerPC implementations.

Where applicable, a reference is given to the portion of the user's manual that describes that
functionality.

Also, the tables in this appendix indicate the level of architecture at which the MPC601
diverges. These levels are as follows:

• PowerPC user instruction set architecture: Defines the base user-level instruction
set, user-level registers, data types, floating-point exception model, memory models
for a uniprocessor environment, and programming model for a uniprocessor
environment.

The tables in this appendix identify differences with this part of the architecture by
listing "User" in the "Level" column of the tables.

• PowerPC virtual environment architecture: This describes the memory model for a
multiprocessor environment, defines cache control instructions, and describes other
aspects of virtual environments. Implementations that conform to the PowerPC
virtual environment architecture also adhere to the PowerPC user instruction set
architecture, but may not necessarily adhere to the PowerPC operating environment
architecture.

The tables in this appendix identify differences with this part of the architecture by
listing "Virtual environment" in the "Level" column of the tables.

• PowerPC operating environment architecture: This defines the memory
management model, supervisor-level registers, synchronization requirements, and
the exception model. Implementations that conform to the PowerPC operating
environment architecture also adhere to the PowerPC user instruction set
architecture and the PowerPC virtual environment architecture definition.

The tables in this appendix identify differences with this part of the architecture by
listing "Operating environment" in the "Level" column of the tables.

H.1 POWER Extensions 

Table H-1 lists POWER functionality supported by the MPC601 that is not defined in the
PowerPC architecture.

Table H-1. POWER Extensions

Difference | Reference | Level | Type
The MQ register, provided for POWER compatibility, is not part of the PowerPC architecture. | Section 2.2.5.1, "MQ Register (MQ)" | User | POWER
The MPC601 implements the real-time clock feature, including POWER registers RTCU and RTCL, to provide a time reference rather than the time base feature defined by the PowerPC architecture. | Section 2.2.5.3, "Real-Time Clock (RTC) Registers" | Virtual and operating environment | POWER and Variance
In the MPC601 processor, the decrementer implementation uses the separate 7.8125 MHz RTC for its base frequency. Other PowerPC processors base the decrementer on the processor clock. | Section 2.3.3.4, "Decrementer (DEC) Register" | User and virtual environment | POWER and Variance
The decrementer register (DEC) in the MPC601 allows user-level read access, which is not provided in the PowerPC architecture. | Section 2.3.3.4, "Decrementer (DEC) Register" | User and operating environment | POWER
Because the MPC601 supports the POWER registers MQ, RTCU, and RTCL, instruction encodings to access them, mtspr and mfspr, are also provided. | Section 3.7.1, "Move to/from Special Purpose Register Instructions" | User and operating environment | POWER
The MPC601 provides a group of instructions for compatibility with POWER. The relationship between the MPC601 and the PowerPC instruction sets is shown in Appendix C. The PowerPC architecture defines these instructions as reserved. | Appendix C, "PowerPC Instructions Not Implemented in MPC601" | | POWER

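The 7.8125 MHz figure in Table H-1 gives a tick period of exactly 128 ns, which is convenient for converting intervals into decrementer counts. The sketch below assumes one count per RTC period, as the table row states; it does not model how the count is placed in the DEC register, and the function names are hypothetical.

```python
from fractions import Fraction

RTC_HZ = Fraction(7_812_500)  # 7.8125 MHz base frequency from Table H-1


def tick_period_ns() -> Fraction:
    """Length of one decrementer tick in nanoseconds."""
    return Fraction(1_000_000_000) / RTC_HZ


def dec_ticks(interval_us: int) -> int:
    """Whole decrementer ticks that fit in interval_us microseconds."""
    return int(Fraction(interval_us, 1_000_000) * RTC_HZ)
```

Exact fractions are used so that the 128 ns period and the tick counts are not affected by floating-point rounding.
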

H.2 Variances 

Variances include PowerPC functionality that is implemented in the MPC601 with some
differences. In general, these variances are not visible from the user level. Table H-2 lists
variances to the PowerPC architecture.

Table H-2. Variances

Difference | Reference | Level | Type
The VXSOFT, VXSQRT, and NI bits (bits 21, 22, and 29, respectively) are not implemented in the MPC601 processor. | Section 2.2.3, "Floating-Point Status and Control Register (FPSCR)" | User and operating environment | Variance
If the Floating-Point Convert to Integer Word (fctiw) instruction results in a conversion exception, FPSCR[VXCVI] is set, causing FPSCR[VX] to be set. The PowerPC architecture specifies that when fctiw causes FPSCR[VXCVI] to be set, FPSCR[XX] is not altered. The MPC601 may set both FPSCR[XX] and FPSCR[VXCVI] in some circumstances. | Section 3.4.3, "Floating-Point Rounding and Conversion Instructions" | User and operating environment | Variance
The architecture requires that both FPSCR[VXCVI] and FPSCR[VXSNAN] be set when the source operand of an fctiw is an SNaN. The MPC601 sets only FPSCR[VXCVI]. | Section 3.4.3, "Floating-Point Rounding and Conversion Instructions" | User and operating environment | Variance
The PowerPC architecture defines the following bits in the machine state register (MSR) that are not implemented in the MPC601: bit 13, power management enable (POW); bit 15, interrupt little-endian mode (ILE); bit 22, branch trace enable (BE); bit 30, recoverable exception (RE); bit 31, little-endian mode (LE). | Section 2.3.1, "Machine State Register (MSR)" | Operating environment | Variance
The MPC601 provides a bit in an implementation-specific register (HID0) for selecting between big- and little-endian modes. The PowerPC architecture defines MSR[LE] for this purpose. | Section 2.4.3, "Byte and Bit Ordering" | Operating environment | Variance
The number, function, content, and format of the BAT registers implemented by the MPC601 are different from those specified by the PowerPC architecture. | Section 2.3.3.11, "BAT Registers" | Operating environment | Variance
The MPC601 clears the reservation bit set by the execution of an lwarx instruction when taking any type of exception. The PowerPC architecture defines that the reservation be cleared for a subset of exceptions. | Section 3.5.7, "Memory Synchronization Instructions" | User and operating environment | Variance
Because the MPC601 does not implement the MSR[RE] bit (recoverable exception bit), the operating system must use other criteria to determine if it is possible to recover from an asynchronous, imprecise interrupt. | Section 5.4.1, "Reset Exceptions (x'00100')" | Operating environment | Variance
Instruction access exceptions due to instruction fetches from I/O controller interface segments do not set the SRR1[3] bit as defined in the PowerPC architecture. This condition clears SRR1[0-15]. | Section 5: Chapter 5, "Exceptions," of this addendum | Operating environment | Variance
The non-IEEE mode bit, FPSCR[NI], is reserved in the MPC601. All floating-point results are consistent with IEEE standards. | Section 5.4.7.1, "Floating-Point Enabled Program Exceptions" | Operating environment | Variance
The MPC601 maps I/O controller interface error conditions to I/O controller interface exceptions instead of to the data access exception vector specified by the PowerPC architecture. | Section 5.4.10, "I/O Controller Interface Error Exception (x'00A00')" | Operating environment | Variance
Unlike exceptions that occur with memory accesses, loads, loads with update, and stores with update to I/O controller interface segments cause any target registers to be updated, regardless of whether an exception is taken. | Section 5: Chapter 5, "Exceptions," of this addendum | Operating environment | Variance
The MPC601 does not implement the trace exception as a separate exception as is defined in the PowerPC architecture (x'00D00'). The MPC601 vectors trace exceptions to the run-mode/trace exception (x'02000'). | Section 5.4.12, "Run Mode/Trace Exception (x'02000')" | Operating environment | Variance
The MPC601 allows access to the I/O controller interface regardless of the setting of MSR[DT]. The PowerPC architecture does not allow these accesses when MSR[DT] is cleared. | Section 6.1.3, "Address Translation Mechanisms" | Operating environment | Variance
The MPC601 does not implement the PowerPC tlbsync instruction, but instead requires the use of a sync instruction to synchronize the completion of a broadcast tlbie instruction. | Section 10.3, "Instructions Not Implemented by the MPC601" | Operating environment | Variance
The PowerPC architecture defines a "Guarded" memory attribute used to protect volatile memory. This attribute is associated with each virtual page (guarded bit in the page table entry) and with physical memory. The MPC601 provides a similar function using the "Caching Inhibited" memory attribute. | Section 6.1.8, "Effects of Instruction Prefetch on MMU" | Operating environment | Variance


H.3 Implementation-Dependent Extensions 

Implementation-dependent extensions include features that are not part of, but are allowed
by, the PowerPC architecture. Note that there are a number of such extensions that are
described throughout the user's manual, and there is no attempt to list them exhaustively in
this appendix. Table H-3 provides a brief list of key implementation-dependent extensions
supported by the MPC601.

Table H-3. Implementation-Dependent Extensions

Difference | Reference | Level | Type
The MPC601 includes the following implementation-specific registers: HID0, HID1, HID2, HID5, and HID15. These may or may not be included in future implementations. | Section 2.3.3.12.1, "Checkstop Sources and Enables Register - HID0," through Section 2.3.3.12.5, "Processor Identification Register (PIR) - HID15" | Operating environment | Implementation-dependent extension
Because the MPC601 automatically handles all floating-point data types, the floating-point assist exception defined in the PowerPC architecture (x'00E00') would never be taken by the MPC601, and therefore it is not implemented. | | Operating environment | Implementation-dependent extension
The MPC601 supports an additional exception called the run mode exception in addition to the exceptions defined by the PowerPC architecture. | Section 5.4.12, "Run Mode/Trace Exception (x'02000')" | Operating environment | Implementation-dependent extension
The MPC601 includes a feature that supports 256-Mbyte translation capability. This is enabled when the T-bit is set and the BUID field = x'07F' in the appropriate segment register(s). | Section 6.5.2.1, "I/O Controller Interface Address Translation: T=1 in Segment Register" | Operating environment | Implementation-dependent extension


H.4 Options to the PowerPC Architecture 

Table H-4 lists options to the PowerPC architecture supported by the MPC601. Note that
because these are optional, they may not be supported by all PowerPC processors, just as
the MPC601 does not support other optional features supported by the architecture.

Table H-4. Options to the PowerPC Architecture

Difference | Reference | Level | Type
The MPC601 processor implements the external access register (EAR), which is optional to the PowerPC architecture. Note that only four bits (28-31) are implemented in the MPC601, whereas the PowerPC architecture defines six bits. | Section 2.3.3.9, "External Access Register (EAR)" | Operating environment | Optional
The MPC601 implements the eciwx and ecowx instructions that are optional to the PowerPC architecture but are required for use with the EAR register. | Section 3.9, "External Control Instructions" | User | Optional

About This Book



The primary objective of this user's manual is to define the functionality of the MPC601
microprocessor for use by software and hardware developers. The MPC601 processor is the
first in the family of PowerPC™ microprocessors, and can provide a reliable foundation for
developing products compatible with subsequent processors in the PowerPC family.
However, the MPC601 provides a bridge between the POWER architecture and the
PowerPC architecture, and as a result there are aspects of the MPC601 processor that are
different from the PowerPC architecture. Therefore, a secondary objective of this manual
is to describe how the MPC601 processor differs from the PowerPC architecture.

The PowerPC architecture is comprised of the following components: 

• PowerPC user instruction set architecture: This includes the base user-level
instruction set (excluding a few user-level cache-control instructions), user-level
registers, programming model, data types, and addressing modes.

• PowerPC virtual environment architecture: This describes the semantics of the
memory model that can be assumed by software processes and includes descriptions
of the cache model, cache-control instructions, address aliasing, and other related
issues. Implementations that conform to the PowerPC virtual environment
architecture also adhere to the PowerPC user instruction set architecture, but may
not necessarily adhere to the PowerPC operating environment architecture.

• PowerPC operating environment architecture: This includes the structure of the
memory management model, supervisor-level registers, and the exception model.
Implementations that conform to the PowerPC operating environment architecture
also adhere to the PowerPC user instruction set architecture and the PowerPC virtual
environment architecture.

It is beyond the scope of the manual to provide a thorough description of the PowerPC 
architecture. It must be kept in mind that each PowerPC processor is a unique 
implementation of the PowerPC architecture. 

For readers of this manual who are concerned about compatibility issues regarding
subsequent PowerPC processors, it is critical to read Chapter 1, "Overview," and in
particular Section 1.3, "MPC601 as a PowerPC Implementation," which outlines in a very
general manner the components of the PowerPC architecture, and indicates where and how
the MPC601 diverges from the PowerPC definition. Instances where the MPC601 differs
from the PowerPC architecture are noted throughout the manual.

Audience

This manual is intended for system software and hardware developers and applications 
programmers who want to develop products for the MPC601 microprocessor and PowerPC 
processors in general. It is assumed that the reader understands operating systems, 
microprocessor system design, and the basic principles of RISC processing. 

Organization 

Following is a summary and a brief description of the major sections of this manual: 

• Chapter 1, "Overview," is useful for readers who want a general understanding of 
the features and functions of the PowerPC architecture and the MPC601 processor. 
This chapter also provides a general description of how the MPC601 differs from 
the PowerPC architecture. 

• Chapter 2, "Registers and Data Types," is useful for software engineers who need to 
understand the PowerPC programming model and the functionality of the registers
implemented in the MPC601. This chapter also describes PowerPC conventions for
storing data in memory. 

• Chapter 3, "Addressing Modes and Instruction Set Summary," provides an 
overview of the PowerPC addressing modes and a description of the instructions 
implemented by the MPC601, including the portion of the PowerPC instruction set
and the additional instructions implemented by the MPC601.

Specific differences between the MPC601 implementation and the PowerPC 
implementation of individual instructions are noted. 

• Chapter 4, "Cache and Memory Unit Operation," provides a discussion of cache 
timing, look-up process, MESI protocol, and interaction with other units. This 
chapter contains information that pertains both to the PowerPC virtual environment 
architecture and to the specific implementation in the MPC601. 

• Chapter 5, "Exceptions," describes the exception model defined in the PowerPC 
operating environment architecture and the specific exception model implemented 
in the MPC601.

• Chapter 6, "Memory Management Unit," provides descriptions of operation of the 
MMU, interaction with other units, and address translation. Although this chapter 
does not provide an in-depth description of both the 64-bit and 32-bit memory 
management model defined by the PowerPC operating environment architecture, it 
does note differences between the defined 32-bit PowerPC definition and the 
MPC601 memory management implementation. 

• Chapter 7, "Instruction Timing," provides information about latencies, interlocks,
special situations, and various conditions to help make programming more efficient.
This chapter is of special interest to software engineers and system designers.
Because each PowerPC implementation is unique with respect to instruction timing,
this chapter primarily contains information specific to the MPC601.

• Chapter 8, "Signal Descriptions," provides descriptions of individual signals of the
MPC601. 

• Chapter 9, "System Interface Operation," describes signal timings for various 
operations. It also provides information for interfacing to the MPC601.

• Chapter 10, "Instruction Set," functions as a handbook of the PowerPC instruction 
set. It provides opcodes, sorted by mnemonic, as well as a more detailed description 
of each instruction. Instruction descriptions indicate whether an instruction is part of 
the PowerPC base architecture or if it is specific to the MPC601. Each description
notes any differences between the MPC601 implementation and the
PowerPC implementation. The descriptions also indicate the privilege level of each
instruction and which execution unit or units execute the instruction.

• Appendix A, "Instruction Set Listings," lists the superset of PowerPC and MPC601
instructions. 

• Appendix B, "POWER Architecture Cross Reference," describes the relationship 
between the MPC601 and the POWER architecture. 

• Appendix C, "PowerPC Instructions Not Implemented in MPC601," describes the
set of PowerPC instructions not implemented in the MPC601 processor. 

• Appendix D, "Classes of Instructions," describes how instructions are classified 
from the perspective of the PowerPC architecture. 

• Appendix E, "Multiple-Precision Shifts," describes how multiple-precision shifts 
can be programmed. 

• Appendix F, "Floating-Point Models," gives examples of how the floating-point 
conversion instructions can be used to perform various conversions. 

• Appendix G, "Synchronization Programming Examples," gives examples showing 
how synchronization instructions can be used to emulate various synchronization 
primitives and how to provide more complex forms of synchronization. 

• This manual also includes a glossary and an index. 



Suggested Reading 



This section lists additional reading that provides background to the information in this 
manual. 

• John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative 
Approach, Morgan Kaufmann Publishers, Inc., San Mateo, CA 

Conventions 

This document uses the following notational conventions: 

ACTIVE_HIGH   Names for signals that are active high are shown in uppercase text
              without an overbar.

ACTIVE_LOW    A bar over a signal name indicates that the signal is active low, for
              example, ARTRY (address retry) and TS (transfer start). Active-low
              signals are referred to as asserted (active) when they are low and
              negated when they are high. Signals that are not active-low, such as
              AP0-AP3 (address bus parity signals) and TT0-TT4 (transfer type
              signals) are referred to as asserted when they are high and negated
              when they are low.

sync          Courier monospaced type indicates code examples.

mnemonics     Instruction mnemonics are shown in lowercase bold.

italics       Italics indicate variable command parameters, for example, bcctrx

x'0F'         Hexadecimal numbers

b'0011'       Binary numbers

rA|0          The contents of a specified GPR or the value 0.

REG[FIELD]    Abbreviations or acronyms for registers are shown in uppercase
              text. Specific bit fields or ranges are shown in brackets.

x             In certain contexts, such as a signal encoding, this indicates a don't
              care. For example, if TT0-TT3 are binary encoded b'x001', the
              state of TT0 is a don't care.
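
The don't-care convention can be illustrated with a short sketch. The helper name matches is hypothetical and is not part of this manual; it compares a bit string against a pattern in which x accepts either value.

```python
def matches(pattern: str, bits: str) -> bool:
    """Return True if bits agrees with pattern wherever pattern is not 'x'."""
    if len(pattern) != len(bits):
        return False
    # A pattern position matches when it is 'x' or equals the actual bit.
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# TT0-TT3 encoded as b'x001': TT0 may be 0 or 1.
print(matches("x001", "0001"))  # True
print(matches("x001", "1001"))  # True
print(matches("x001", "0011"))  # False
```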



Acronyms and Abbreviations 

Table i contains acronyms and abbreviations that are used in this document.
Table i. Acronyms and Abbreviated Terms

Term      Meaning
ALU       Arithmetic logic unit
ASR       Address space register
BAT       Block address translation
BIST      Built-in self test
BPU       Branch processing unit
BTLB      Block translation look-aside buffer
BUC       Bus unit controller
CAR       Cache address register
CMOS      Complementary metal-oxide semiconductor
COP       Common on-chip processor
CR        Condition register
CTR       Count register
DABR      Data address breakpoint register
DAE       Data access exception
DAR       Data address register
DBAT      Data BAT
DEC       Decrementer register
DSISR     DAE/source instruction service register
EA        Effective address
EAR       External access register
ECC       Error checking and correction
FPECR     Floating-point exception cause register
FPR       Floating-point register
FPSCR     Floating-point status and control register
FPU       Floating-point unit
GPR       General-purpose register
IABR      Instruction address breakpoint register
IBAT      Instruction BAT
IEEE      Institute of Electrical and Electronics Engineers
IQ        Instruction queue
ITLB      Instruction translation look-aside buffer
IU        Integer unit
L2        Secondary cache
LR        Link register
LRU       Least recently used
LSB       Least-significant byte
lsb       Least-significant bit
MDR       Memory descriptor register
MESI      Modified/exclusive/shared/invalid (cache coherency protocol)
MMU       Memory management unit
MQ        MQ register
MSB       Most-significant byte
msb       Most-significant bit
MSR       Machine state register
NaN       Not a number
no-op     No operation
PID       Processor identification tag
PIR       Processor identification register
POWER     Performance Optimized with Enhanced RISC
PR        Privilege level bit
PTE       Page table entry
PTEG      Page table entry group
PVR       Processor version register
RISC      Reduced instruction set computer
RTC       Real-time clock
RTCL      Real-time clock lower register
RTCU      Real-time clock upper register
RTL       Register transfer level
RWITM     Read with intent to modify
SDR1      Table search descriptor register 1
SLB       Segment look-aside buffer
SPR       Special-purpose register
SPRGn     General SPR
SR        Segment register
SRR0      Machine status save/restore register 0
SRR1      Machine status save/restore register 1
TB        Time base register
TLB       Translation look-aside buffer
TTL       Transistor-to-transistor logic
UTLB      Unified translation look-aside buffer
WIM       Write-through/cache-inhibited/memory-coherency enforced bits
XER       Integer exception register



Differences between IBM and Motorola Terminology 

Table ii describes terminology conventions used in this manual, noting in particular the 
differences between IBM and Motorola usages. 

Table ii. Differences between IBM and Motorola Terminology

IBM                       Motorola
Interrupt                 Exception
Programmable I/O (PIO)    I/O controller interface operation
Relocation                Translation
Storage                   Memory
Store in                  Write back
Store through             Write through



Chapter 1
Overview 



This chapter provides an overview of the MPC601 features, including a block diagram
showing the major functional components. It also provides an overview of the PowerPC
architecture and hardware design conventions adapted for current and forthcoming
PowerPC processors, and information about how the MPC601 implementation differs from or
augments these architectural and hardware definitions.

1.1 MPC601 Overview 

This section describes the features of the MPC601, provides a block diagram showing the 
major functional units, and gives an overview of how the MPC601 operates. 

The MPC601 is the first implementation of the PowerPC™ family of reduced instruction 
set computer (RISC) microprocessors. The MPC601 is a 32-bit implementation of the 
64-bit PowerPC architecture. It is a superscalar processor capable of issuing and retiring 
three instructions per clock, one to each of three execution units. Instructions can complete 
out of order for increased performance; however, the MPC601 makes execution appear 
sequential. 

The MPC601 integrates three execution units: an integer unit (IU), a branch processing
unit (BPU), and a floating-point unit (FPU). The ability to execute three instructions in
parallel and the use of simple instructions with rapid execution times yield high efficiency
and throughput for MPC601-based systems. Most integer instructions execute in one clock
cycle. The FPU is pipelined so a single-precision multiply-add instruction can be issued 
every clock cycle. 

The MPC601 includes an on-chip, 32-Kbyte, eight-way set-associative, physically 
addressed, unified instruction and data cache and an on-chip memory management unit 
(MMU). The MMU contains a 256-entry, two-way set-associative, unified translation look- 
aside buffer (UTLB) and provides support for demand paged virtual memory address 
translation and variable-sized block translation. Both the UTLB and the cache use least 
recently used (LRU) replacement algorithms.

The MPC601 has a high-bandwidth, 64-bit data bus and a 32-bit address bus. The MPC601
interface protocol allows multiple masters to compete for system resources through a
central external arbiter. Additionally, on-chip snooping logic maintains cache coherency in
multiprocessor applications. The MPC601 supports single-beat and burst data transfers for
memory accesses; it also supports both memory-mapped I/O and I/O controller interface 
addressing. 

The MPC601 uses an advanced, 3.6-V CMOS process technology and maintains full
interface compatibility with TTL devices. 

1.1.1 MPC601 Features 

This section describes features specific to the MPC601. Note that general characteristics of 
the PowerPC architecture and hardware conventions among the family of PowerPC 
processors are listed in Section 1.3.1, "Features." Major features of the MPC601 are as 
follows: 

• High-performance, superscalar microprocessor 

— As many as three instructions in execution per clock (one to each of the three 
execution units) 

— Single clock cycle execution for most instructions 

— Pipelined FPU for all single-precision and most double-precision operations 

• Three independent execution units and two register files 

— BPU featuring static branch prediction 

— A 32-bit IU

— Fully IEEE 754-compliant FPU for both single- and double-precision operations 

— Thirty-two GPRs for integer operands 

— Thirty-two FPRs for single- or double-precision operands

• High instruction and data throughput 

— Condition register (CR) look-ahead operations performed by BPU 

— Zero-cycle branch capability 

— Programmable static branch prediction on unresolved conditional branches 

— Instruction unit capable of prefetching eight instructions per clock from the 
cache 

— A prefetch queue that can hold as many as eight instructions that provides look- 
ahead capability 

— Interlocked pipelines with feed-forwarding that control data dependencies in 
hardware 

— Unified 32-Kbyte cache — eight-way set-associative, physically addressed; LRU 
replacement algorithm 

— Cache write-back or write-through operation programmable on a per page or per 
block basis 

— Memory unit with a two-element read queue and a three-element write queue

— Run-time reordering of loads and stores

— Address translation facilities for 4-Kbyte page size, variable block size, and 
256-Mbyte segment size 

— A 256-entry, two-way set-associative UTLB

— Four-entry, first-level ITLB 

— Hardware table search (caused by UTLB misses) through hashed page tables 

— 52-bit virtual address; 32-bit physical address 

— Four-entry BTLB providing 128-Kbyte to 8-Mbyte blocks

• Facilities for enhanced system performance

— Bus speed defined as selectable division of operating frequency 

— A 64-bit split-transaction external data bus with burst transfers 

— Support for address pipelining and limited out-of-order bus transactions 

— Snooped copyback queues for cache block (sector) copyback operations 

— Bus extensions for I/O controller interface operations 

— Multiprocessing support features that include the following: 

- Hardware enforced, four-state cache coherency protocol (MESI) 

- Separate port into cache tags for bus snooping 

1.1.2 Block Diagram 

Figure 1-1 provides a block diagram of the MPC601 that illustrates how the execution 
units (IU, FPU, and BPU) operate independently and in parallel.

The MPC601's 32-Kbyte, unified cache tag directory has a port dedicated to snooping bus
transactions, preventing interference with processor access to the cache. The MPC601 also 
provides address translation and protection facilities, including a UTLB and a BTLB, and 
a four-entry ITLB that contains the four most recently used instruction address translations 
for fast access by the instruction unit. 

Instruction prefetching and issuing are handled in the instruction unit. Translation of
addresses for cache or external memory accesses is handled by the memory management
unit. Both units are discussed in more detail in Sections 1.1.3, "Instruction Unit," and 1.1.5,
"Memory Management Unit (MMU)."

1.1.3 Instruction Unit

As shown in Figure 1-1, the MPC601 instruction unit, which contains an instruction queue 
and the BPU, provides centralized control of instruction flow to the execution units. The 
instruction unit determines the address of the next instruction to be fetched based on 
information from a sequential fetcher and the BPU. The instruction unit also enforces 
pipeline interlocks and controls feed-forwarding. 

The sequential fetcher is a dedicated adder that computes the address of the next sequential 
instruction based on the address of the last fetch and the number of words accepted into the 
queue. The BPU searches the bottom half of the instruction queue for a branch instruction 
and uses static branch prediction on unresolved conditional branches to allow the 
instruction fetch unit to prefetch instructions from a predicted target instruction stream 
while a conditional branch is evaluated. The BPU also folds out branch instructions for 
unconditional branches. 
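
The sequential fetcher's address calculation can be sketched as follows. This is a minimal illustration, not a description of the hardware; the function name is hypothetical, and it relies on each PowerPC instruction occupying one 32-bit (4-byte) word.

```python
WORD_BYTES = 4  # each PowerPC instruction is one 32-bit word


def next_fetch_address(last_fetch: int, words_accepted: int) -> int:
    """Address of the next sequential instruction, wrapped to 32 bits."""
    return (last_fetch + WORD_BYTES * words_accepted) & 0xFFFFFFFF


# Eight words (one cache sector) accepted starting at 0x1000.
print(hex(next_fetch_address(0x1000, 8)))  # 0x1020
```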

Instructions issued beyond a predicted branch do not complete execution until the branch 
is resolved, preserving the programming model of sequential execution. If any of these 
instructions are to be executed in the BPU, they are decoded but not issued to the BPU. If 
any of these are FPU and IU instructions, they are issued and allowed to complete up to the
register write-back stage, but no write back is performed. 

When a correctly predicted branch is resolved, instruction execution continues without 
interruption along the predicted path. If branch prediction is incorrect, the instruction 
fetcher flushes all instructions from the instruction queue. Instruction issue then resumes 
with the instruction from the correct path. Instructions are never issued to the IU or FPU
unless they must be executed by the program. 

1.1.3.1 Instruction Queue

The instruction queue, shown in Figure 1-2, contains instructions prefetched from the 
current instruction stream. 

The instruction unit prefetches instructions from the cache into the instruction queue. As 
many as eight instructions (a cache sector) can be loaded into the instruction queue during 
any cycle. Instructions move from the top of the queue (Q7) towards the bottom of the 
queue (Q0) and a full range of shift amounts through the queue is supported.

The upper half of the instruction queue (Q4-Q7) provides buffering to reduce the need to 
access the cache. Some initial decoding of instructions is performed in the lower half (Q0
through Q3) of the queue. Q0 functions as the initial decode stage for the IU.
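
The queue behavior described above can be modeled with a simplified software sketch. The class and method names are hypothetical, and the model ignores timing and decoding; it only shows that at most eight entries are held and that entries above an issued instruction shift down toward Q0.

```python
QUEUE_DEPTH = 8  # Q0 (bottom) through Q7 (top)


class InstructionQueue:
    def __init__(self):
        # slots[0] models Q0; the last element is the highest occupied entry.
        self.slots = []

    def load(self, fetched):
        """Accept as many fetched instructions as there are free entries."""
        free = QUEUE_DEPTH - len(self.slots)
        accepted = list(fetched)[:free]
        self.slots.extend(accepted)
        return len(accepted)

    def issue(self, position):
        """Remove the instruction at the given entry; higher entries shift down."""
        return self.slots.pop(position)


q = InstructionQueue()
# Nine instructions are offered, but only eight fit in the queue.
print(q.load(["add", "lwz", "fadd", "bc", "stw", "or", "fmul", "addi", "xor"]))  # 8
print(q.issue(2))   # fadd
print(q.slots[2])   # bc
```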

As instructions issue to the BPU and FPU, new instructions are loaded into the queue.

1.1.4 Independent Execution Units

One benefit of the PowerPC architecture is its support for independent floating-point, 
integer, and branch processing execution units, making it possible to implement advanced 
features such as look-ahead operations and out-of-order instruction dispatches. For 
example, since branch instructions do not depend on GPRs or FPRs, branches can often be 
resolved early, eliminating stalls caused by taken branches. Additionally, upon resolution 
of the branch, the branch instruction is removed from the pipeline and fetching continues 
from the first instruction in the target stream. This procedure is called branch folding. 

The following sections describe the MPC601's three execution units: the BPU, IU, and
FPU. 

1.1.4.1 Branch Processing Unit (BPU) 

The BPU performs condition register (CR) look-ahead operations on conditional branches. 
The BPU looks through the bottom half of the instruction queue for a conditional branch 
instruction and attempts to resolve it early, achieving the effect of a zero-cycle branch in 
many cases. 

The BPU uses a bit in the instruction encoding to predict the direction of the conditional 
branch. Therefore when an unresolved conditional branch instruction is encountered, the 
MPC601 prefetches instructions from the predicted target stream until the conditional
branch is resolved. 

The BPU contains an adder to compute branch target addresses and three special-purpose, 
user-control registers — the LR, the CTR, and the CR. The BPU calculates the return pointer 
for subroutine calls and saves it into the LR. The LR also contains the branch target address
for the Branch Conditional to Link Register (bclrx) instruction. The CTR contains the
branch target address for the Branch Conditional to Count Register (bcctrx) instruction.
The contents of the LR and CTR can be copied to or from any GPR. Because the BPU uses 
dedicated registers rather than general-purpose or floating-point registers, execution of 
branch instructions is independent from execution of integer and floating-point 
instructions. 

1.1.4.2 Integer Unit (IU)

The IU executes all integer and memory access instructions (including those required for
floating-point registers). The IU contains an arithmetic logic unit (ALU), a multiplier, a
divider, the integer exception register (XER), and the general-purpose register file. One
instruction can be issued to the IU each clock cycle.

The IU interfaces with the cache and MMU for all instructions that access memory.
Addresses are formed by adding the source 1 register operand specified by the instruction 
(or zero) to either a source 2 register operand or to a 16-bit, immediate value embedded in 
the instruction. 
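
The address formation described above can be sketched as follows. The function names are hypothetical. Two details are assumptions taken from the PowerPC rA|0 convention rather than from this paragraph: a source 1 register number of 0 selects the value 0, and the 16-bit immediate is sign-extended before the addition.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(gprs, ra, rb=None, d=0):
    """Form a 32-bit effective address from (rA|0) plus rB or an immediate."""
    base = 0 if ra == 0 else gprs[ra]          # (rA|0): register 0 selects 0
    offset = gprs[rb] if rb is not None else sign_extend16(d)
    return (base + offset) & MASK32


gprs = [0] * 32
gprs[1] = 0x1000
gprs[2] = 0x20
print(hex(effective_address(gprs, 1, rb=2)))      # 0x1020
print(hex(effective_address(gprs, 1, d=0xFFFC)))  # 0xffc
```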

Load and store instructions are issued and translated in program order; however, the 
accesses can occur out of order. These accesses can be strictly ordered through the use of 
synchronizing instructions. 

Load and store instructions are considered to have completed execution after the address is 
translated. If the address for a load or store instruction hits in the UTLB or BTLB and it is 
aligned, the instruction execution takes one clock cycle, allowing back-to-back issue of 
load and store instructions. 

1.1.4.3 Floating-Point Unit (FPU) 

The FPU contains a single-precision multiply-add array, a divider, the floating-point status 
and control register (FPSCR), and the FPRs. The multiply-add array allows the MPC601 to 
efficiently implement floating-point operations such as multiply, add, and multiply-add. 
The FPU is pipelined so that most single-precision instructions and many double-precision 
instructions can be issued back-to-back. The FPU contains two additional instruction 
queues. These queues allow floating-point instructions to be issued from the instruction
queue even if the FPU is busy, making instructions available for issue to the other execution 
units. 

Like the BPU, the FPU can access instructions from the bottom half of the instruction queue 
(Q3-Q0), which permits floating-point instructions that do not depend on unexecuted 
instructions to be issued early to the FPU, thus maximizing efficiency and reducing 
bottlenecks in the instruction pipeline. 

All IEEE 754 floating-point data types (normalized, denormalized, NaN, zero, and infinity) 
are supported in hardware on the MPC601, which eliminates the latency incurred by
software exception routines to support all data types.

1.1.5 Memory Management Unit (MMU)

The MPC601's MMU supports up to 4 Petabytes (2^52) of virtual memory and 4 Gigabytes
(2^32) of physical memory. The MMU also controls access privileges for these spaces on
block and page granularities. Referenced and changed status are maintained by the 
processor for each page to assist implementation of a demand-paged virtual memory 
system. 
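
The address-space sizes follow directly from the 52-bit virtual address and 32-bit physical address listed among the features in Section 1.1.1. The short calculation below is only a worked check of that arithmetic; the constant and function names are illustrative.

```python
VIRTUAL_ADDRESS_BITS = 52
PHYSICAL_ADDRESS_BITS = 32

PETABYTE = 1 << 50
GIGABYTE = 1 << 30


def address_space_bytes(bits: int) -> int:
    """Number of bytes addressable with the given number of address bits."""
    return 1 << bits


print(address_space_bytes(VIRTUAL_ADDRESS_BITS) // PETABYTE)   # 4
print(address_space_bytes(PHYSICAL_ADDRESS_BITS) // GIGABYTE)  # 4
```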

The instruction unit generates all instruction addresses; these addresses are both for 
sequential instruction prefetches and addresses that correspond to a change of program 
flow. The integer unit generates addresses for data accesses (both for memory and the I/O 
controller interface). 

After an address is generated, the upper order bits of the logical address are translated by 
the MMU into physical address bits. Simultaneously, the lower order address bits (that are 
untranslated and therefore considered both logical and physical), are directed to the on-chip 
cache where they form the index into the eight-way set-associative tag array. After 
translating the address, the MMU passes the higher-order bits of the physical address to the 
cache, and the cache lookup completes. For cache-inhibited accesses or accesses that miss 
in the cache, the untranslated lower order address bits are concatenated with the translated 
higher-order address bits; the resulting 32-bit physical address is then used by the memory 
unit and the system interface, which accesses external memory. 
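
The concatenation step can be sketched as follows, assuming page translation with the 4-Kbyte page size listed in Section 1.1.1, so that the low 12 bits are untranslated. The function names and the example page number are hypothetical.

```python
PAGE_OFFSET_BITS = 12                        # 4-Kbyte pages
PAGE_OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1


def split_logical(address: int):
    """Split a 32-bit logical address into (upper bits, untranslated offset)."""
    return address >> PAGE_OFFSET_BITS, address & PAGE_OFFSET_MASK


def physical_address(physical_page: int, logical: int) -> int:
    """Concatenate translated upper bits with the untranslated lower bits."""
    _, offset = split_logical(logical)
    return ((physical_page << PAGE_OFFSET_BITS) | offset) & 0xFFFFFFFF


# The MMU is assumed to have translated the upper bits to page 0x54321.
print(hex(physical_address(0x54321, 0x00012ABC)))  # 0x54321abc
```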

The MMU also directs the address translation and enforces the protection hierarchy 
programmed by the operating system in relation to the supervisor/user privilege level of the 
access and in relation to whether the access is a load or store. 

For instruction accesses, the MMU first performs a lookup in the four entries of the ITLB 
for the physical address translation. Instruction accesses that miss in the ITLB and all data 
accesses cause a lookup in the UTLB and BTLB for the physical address translation. In 
most cases, the physical address translation resides in one of the TLBs and the physical 
address bits are readily available to the on-chip cache. In the case where the physical 
address translation misses in the TLBs, the MPC601 automatically performs a search of the 
translation tables in memory using the information in the SDR1 and the corresponding
segment register. 
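
The lookup order described above can be modeled with a simplified sketch. All names are hypothetical. The TLBs and the page table are represented as dictionaries keyed by page number, the UTLB and BTLB are checked one after the other rather than in parallel, and caching a table-search result in the UTLB is an assumption of the model.

```python
def translate(page, is_instruction, itlb, utlb, btlb, page_table):
    """Return (where the translation was found, physical page number)."""
    # Instruction accesses look in the four-entry ITLB first.
    if is_instruction and page in itlb:
        return "ITLB", itlb[page]
    # ITLB misses and all data accesses look in the BTLB and UTLB.
    if page in btlb:
        return "BTLB", btlb[page]
    if page in utlb:
        return "UTLB", utlb[page]
    # Miss in the TLBs: the hardware table search is modeled as a
    # dictionary lookup (a missing key stands in for an exception).
    physical = page_table[page]
    utlb[page] = physical  # assumption: the result is kept in the UTLB
    return "table search", physical


itlb, utlb, btlb = {1: 100}, {2: 200}, {}
page_table = {3: 300}
print(translate(1, True, itlb, utlb, btlb, page_table))   # ('ITLB', 100)
print(translate(3, False, itlb, utlb, btlb, page_table))  # ('table search', 300)
print(translate(3, False, itlb, utlb, btlb, page_table))  # ('UTLB', 300)
```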

Memory management in the MPC601 is described in more detail in Section 1.3.6.2,
"MPC601 Memory Management." 

1.1.6 Cache Unit 

The MPC601 contains a 32-Kbyte, eight-way set associative, unified (instruction and data) 
cache. The cache line size is 64 bytes, divided into two eight-word sectors, each of which 
can be snooped, loaded, cast-out, or invalidated independently. The cache is designed to 
adhere to a write-back policy, but the MPC601 allows control of cacheability, write policy, 
and memory coherency at the page and block level. The cache uses a least recently used 
(LRU) replacement policy.

The cache provides an eight-word interface to the rest of the device. The surrounding logic
selects, organizes, and forwards the requested information to the requesting unit. Write 
operations to the cache can be performed on a byte basis, and a complete read-modify-write 
operation to the cache can occur in each cycle. 
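
The cache geometry given in this section implies the following arithmetic. This is a worked sketch using the stated sizes; the function name locate and the example address are illustrative, and the bit assignment assumes the set index is taken from the address bits just above the byte offset within a line.

```python
CACHE_BYTES = 32 * 1024
WAYS = 8
LINE_BYTES = 64
SECTORS_PER_LINE = 2
WORD_BYTES = 4

sets = CACHE_BYTES // (WAYS * LINE_BYTES)        # 64 sets
sector_bytes = LINE_BYTES // SECTORS_PER_LINE    # 32 bytes per sector
words_per_sector = sector_bytes // WORD_BYTES    # 8 words per sector


def locate(address: int):
    """Return (set index, sector within line, byte within line)."""
    byte_in_line = address % LINE_BYTES
    set_index = (address // LINE_BYTES) % sets
    sector = byte_in_line // sector_bytes
    return set_index, sector, byte_in_line


print(sets, sector_bytes, words_per_sector)  # 64 32 8
print(locate(0x12ABC))                       # (42, 1, 60)
```

Because each of the eight ways holds 4 Kbytes (64 sets of 64 bytes), the set index and byte offset fit entirely within the untranslated low-order 12 bits of a 4-Kbyte page.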

The instruction unit provides the cache with the address of the next instruction to be 
prefetched. In the case of a cache hit, the cache returns the instruction and as many of the 
instructions following it as can be placed in the eight-word instruction queue up to the 
cache sector boundary. If the queue is empty, as many as eight words (an entire sector) can 
be loaded into the queue in parallel. 

The cache has one address port dedicated to instruction fetch and load/store accesses and 
one dedicated to snooping transactions on the system interface. Therefore, snooping does 
not require additional clock cycles unless a snoop hit that requires a cache status update 
occurs. 

1.1.7 Memory Unit 

The MPC601's memory unit contains read and write queues that buffer operations between
the external interface and the cache. These include operations resulting
from load and store instructions that miss in the cache, read and write operations
required to maintain cache coherency, and table search operations. As shown in Figure 1-3,
the read queue contains two elements and the write queue contains three elements. Each 
element of the write queue can contain as many as eight words (one sector) of data. One 
element of the write queue, marked snoop in Figure 1-3, is dedicated to writing cache 
sectors to system memory after a modified sector is hit by a snoop from another processor 
or snooping device on the system bus. The use of this queue guarantees a high priority 
operation that ensures a deterministic response time when snooping hits a modified sector.

Figure 1-3. Memory Unit

The other two elements in the write queue are used for store operations and writing back 
modified sectors that have been deallocated by updating the queue; that is, when a cache
location is full, the least-recently used cache sector is deallocated by first being copied into
the write queue and from there to system memory. Note that snooping can occur after a 
sector has been pushed out into the write queue and before the data has been written to 
system memory. Therefore, to maintain a coherent memory, the write queue elements are 
compared to snooped addresses in the same way as the cache tags. If a snoop hits a write 
queue element, the data is first stored in system memory before it can be loaded into the 
cache of the snooping bus master. Full coherency checking between the cache and the write 
queue prevents dependency conflicts. 
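
The write-queue snoop check described above can be modeled with a simplified sketch. The data structures and names are hypothetical: the write queue is a list of dictionaries, system memory is a dictionary keyed by sector address, and timing is ignored.

```python
SECTOR_BYTES = 32  # one write-queue element holds one eight-word sector


def sector_base(address: int) -> int:
    """Align an address down to the start of its 32-byte sector."""
    return address & ~(SECTOR_BYTES - 1)


def snoop_write_queue(write_queue, memory, snooped_address):
    """Write any queued sector matching the snooped address to memory.

    Returns True if the snoop hit a write-queue element.
    """
    target = sector_base(snooped_address)
    hit = False
    for entry in list(write_queue):
        if entry["address"] == target:
            memory[target] = entry["data"]  # data reaches system memory first
            write_queue.remove(entry)
            hit = True
    return hit


queue = [{"address": 0x1000, "data": "A"}, {"address": 0x2000, "data": "B"}]
memory = {}
print(snoop_write_queue(queue, memory, 0x1010))  # True
print(memory)                                    # {4096: 'A'}
```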

Execution of a load or store instruction is considered complete when the associated address 
translation completes, guaranteeing that the instruction has completed to the point where it 
is known that it will not generate an internal exception. However, after address translation 
is complete, a read or write operation can still generate an external exception. 

Load and store instructions are always issued and translated in program order with respect 
to other load and store instructions. However, a load or store operation that hits in the cache 
can complete ahead of those that miss in the cache. The MPC601 ensures memory 
consistency by comparing target addresses and prohibiting instructions from completing 
out of order if an address matches. Load and store operations can be forced to execute in 
strict program order by using the synchronization instructions. 

1.1.8 System Interface

Because the cache on the MPC601 is an on-chip, write-back primary cache, the 
predominant type of transaction for most applications is burst-read memory operations, 
followed by burst-write memory operations, I/O controller interface operations, and single- 
beat (noncacheable or write-through) memory read and write operations. Additionally, 
there can be address-only operations, variants of the burst and single-beat operations 
(global memory operations that are snooped, and atomic memory operations, for example), 
and address retry activity (for example, when a snooped read access hits a modified line in 
the cache). 

Memory accesses can occur in single-beat and four-beat burst data transfers. The address 
and data buses are independent for memory accesses to support pipelining and split 
transactions. The MPC601 can pipeline as many as two transactions and has limited support 
for out-of-order split-bus transactions. 
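
The relationship between the burst length and the cache sector size can be checked with a short calculation. It uses the 64-bit data bus and eight-word sector stated earlier in this chapter; the function name is illustrative.

```python
DATA_BUS_BYTES = 8    # 64-bit data bus
BURST_BEATS = 4       # four-beat burst
SECTOR_BYTES = 32     # eight 4-byte words


def beats_needed(num_bytes: int) -> int:
    """Number of data beats needed to move num_bytes over the data bus."""
    return -(-num_bytes // DATA_BUS_BYTES)  # ceiling division


# A four-beat burst moves exactly one cache sector.
print(DATA_BUS_BYTES * BURST_BEATS == SECTOR_BYTES)  # True
print(beats_needed(SECTOR_BYTES))                    # 4
print(beats_needed(4))                               # 1
```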

Memory is accessed through an arbitration mechanism that allows devices to compete for 
bus mastership. This arbitration mechanism is flexible, allowing the MPC601 to be 
integrated into systems that implement various fairness and bus-parking procedures to 
avoid arbitration overhead. Additional multiprocessor support is provided through 
coherency mechanisms that provide snooping, external control of the on-chip cache and 
TLB, and support for a secondary cache. Multiprocessor software support is provided 
through the use of atomic memory operations.

Typically, memory accesses are weakly ordered — sequences of operations, including
load/store string and multiple instructions, do not necessarily complete in the order they 
begin — maximizing the efficiency of the bus without sacrificing coherency of the data. The 
MPC601 allows read operations to precede store operations (except when a dependency 
exists, of course). In addition, the MPC601 may reorder high priority store operations 
ahead of lower priority store operations. Because the processor can dynamically optimize 
run-time ordering of load/store traffic, overall performance is improved. 

1.2 Levels of the PowerPC Architecture 

The PowerPC architecture consists of the following layers, and adherence to the PowerPC 
architecture can be measured in terms of which of the following levels of the architecture 
is implemented: 

• PowerPC user instruction set architecture — This definition includes the base user-
level instruction set (excluding a few user-level memory-control instructions), user- 
level registers, programming model, data types, and addressing modes. 

Aspects of the PowerPC user instruction set architecture are discussed in Chapter 2, 
"Registers and Data Types," Chapter 3, "Addressing Modes and Instruction Set 
Summary," and Chapter 10, "Instruction Set." 

• PowerPC virtual environment architecture — This describes the semantics of the
memory model that can be assumed by software processes and includes descriptions
of the cache model, cache-control instructions, address aliasing, and other related
issues. Implementations that
conform to the PowerPC virtual environment architecture also adhere to the 
PowerPC user instruction set architecture, but may not necessarily adhere to the 
PowerPC operating environment architecture. 

Aspects of the PowerPC virtual environment architecture are discussed in 
Chapter 2, "Registers and Data Types," Chapter 3, "Addressing Modes and 
Instruction Set Summary," Chapter 5, "Exceptions," and Chapter 10, "Instruction 
Set." 

• PowerPC operating environment architecture — This includes the structure of the 
memory management model, supervisor-level registers, and the exception model. 
Implementations that conform to the PowerPC operating environment architecture 
also adhere to the PowerPC user instruction set architecture and the PowerPC virtual 
environment architecture definition. 

Aspects of the PowerPC operating environment architecture are discussed in 
Chapter 2, "Registers and Data Types," Chapter 3, "Addressing Modes and 
Instruction Set Summary," Chapter 5, "Exceptions," Chapter 6, "Memory 
Management Unit," and Chapter 10, "Instruction Set." 

Note that while the MPC601 is said to adhere to the PowerPC architecture at all three 
levels, it diverges in aspects of its implementation to a greater extent than can be expected 
of subsequent PowerPC processors. Many of the differences result from the fact that the
MPC601 design is pivotal, providing compatibility with an existing architecture standard
(POWER), while providing a reliable platform for hardware and software development 
compatible with subsequent PowerPC processors. 

The PowerPC architecture allows a wide range of designs for such features as cache and 
system interface implementations. 

1.3 MPC601 as a PowerPC Implementation 

The PowerPC architecture is derived from the IBM Performance Optimized with Enhanced 
RISC (POWER) architecture. The PowerPC architecture shares the benefits of the POWER 
architecture optimized for single-chip implementations. The architecture design facilitates 
parallel instruction execution and is scalable to take advantage of future technological 
gains. For compatibility, the MPC601 also implements instructions from the POWER user 
programming model that are not part of the PowerPC definition. 

This section describes the PowerPC architecture in general, noting where the MPC601 
differs. The organization of this section follows the sequence of the chapters in this manual 
as follows: 

• Features — This section describes general features that the MPC601 shares with the
PowerPC family of microprocessors. It does not list PowerPC features not 
implemented in the MPC601. 

• Registers and programming model — This section describes the architected registers
for the operating environment architecture common among PowerPC processors
and describes the programming model. It also describes differences in how the
architected registers are used in the MPC601 and describes the additional registers
that are unique to the MPC601.

• Instruction set and addressing modes — This section describes the PowerPC
instruction set and addressing modes for the PowerPC operating environment 
architecture, and it generally defines the subset of the instruction set implemented in 
the MPC601 as well as additional instructions implemented in the MPC601 but not 
defined in the PowerPC architecture. 

• Cache implementation — This section describes the cache model that is defined
generally for PowerPC processors by the virtual environment architecture. It also 
provides specific details about the MPC601 cache implementation. 

• Exception model — This section describes the exception model of the PowerPC 
operating environment architecture and the differences in the MPC601 exception 
model. 

• Memory management — This section describes generally the conventions for
memory management among the PowerPC processors. Note that the PowerPC 
operating environment architecture defines different memory management designs 
for 64- and 32-bit implementations. This section also describes the general 
differences between the MPC601 and the 32-bit PowerPC memory management 
specification. 






• Instruction timing — This section provides a general description of the instruction
timing provided by the superscalar, parallel execution supported by the PowerPC 
architecture. 

• System interface — This section describes the signals implemented on the MPC601.

1.3.1 Features 

The MPC601 incorporates the following features of the PowerPC architecture:

• High-performance, superscalar microprocessor implementations 

The PowerPC architecture allows optimizing compilers to schedule instructions to 
maximize performance through efficient use of the PowerPC instruction set and 
register model. The multiple, independent execution units allow compilers to 
maximize parallelism and instruction throughput. Compilers that take advantage of 
the flexibility of the PowerPC architecture can additionally optimize system 
performance of the PowerPC processors. 

The PowerPC architecture supports the following: 

— Multiple, independent execution units 

— Single clock cycle execution for most instructions 

— Fully IEEE 754-compliant FPU for both single- and double-precision operations 

— Thirty-two general-purpose registers (GPRs) for integer operands 

— Thirty-two floating-point registers (FPRs) for single- or double-precision 
operands 

• High instruction and data throughput 

— Cache write-back or write-through operation programmable on a per-page or 
per-block basis 

— Run-time reordering of loads and stores 

• Facilities for enhanced system performance 

— Programmable big- and little-endian byte ordering 

— Interprocessor UTLB invalidation 

— Interprocessor cache control operations 

— Atomic memory references 

• In-system testability and debugging features through boundary-scan capability 

Specific features of the MPC601 are listed in Section 1.1.1, "MPC601 Features." 

1.3.2 Registers and Programming Model

The following subsections describe the general features of the PowerPC registers and 
programming model and of the specific MPC601 implementation, respectively.

1.3.2.1 PowerPC Registers and Programming Model

The PowerPC architecture defines register-to-register operations for most computational 
instructions. Source operands for these instructions are accessed from the on-chip registers 
or are provided as immediate values embedded in the instruction opcode. The three-register 
instruction format allows specification of a different target register from the two source 
registers. Data is transferred between memory and registers with explicit load and store 
instructions only. 

PowerPC processors have two levels of privilege: supervisor mode (typically used by
the operating environment) and user mode (used by application software). The
programming models incorporate 32 GPRs, 32
FPRs, special-purpose registers (SPRs), and several miscellaneous registers. Note that 
there are several registers that are part of the PowerPC architecture that are not 
implemented in the MPC601; for example, the time base registers are not implemented in 
the MPC601 and the address space register (ASR) is implemented only in 64-bit 
implementations. Likewise, each PowerPC implementation has its own unique set of 
hardware implementation (HID) registers, which are implementation-specific. 

This division allows the operating system to control the application environment 
(providing virtual memory and protecting operating-system and critical machine 
resources). Instructions that control the state of the processor, the address translation 
mechanism, and supervisor registers can be executed only when the processor is operating 
in supervisor mode. 

The following sections summarize the PowerPC registers that are implemented in the 
MPC601 processor. Chapter 2, "Registers and Data Types," provides detailed
information about the registers implemented in the MPC601. 

1.3.2.1.1 General-Purpose Registers (GPRs) 

The PowerPC architecture defines 32 user-level, general-purpose registers (GPRs). These 
registers are either 32 or 64 bits wide depending on the implementation. The GPRs serve 
as the data source or destination for all integer instructions and provide addresses for all 
memory-access instructions. 

1.3.2.1.2 Floating-Point Registers (FPRs) 

The PowerPC architecture also defines 32 user-level 64-bit floating-point registers (FPRs) 
for both 32- and 64-bit PowerPC implementations. The FPRs serve as the data source or 
destination for floating-point instructions. These registers can contain data objects of either 
single- or double-precision floating-point formats. The floating-point register file can only 
be accessed by the FPU. 

1.3.2.1.3 Condition Register (CR) 

The CR is a 32-bit user-level register that consists of eight four-bit fields that reflect the
results of certain operations, such as move, integer and floating-point compare, arithmetic,
and logical instructions, and provide a mechanism for testing and branching. The CR is 32
bits wide in all implementations.
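
The field layout described above can be illustrated with a short Python sketch that
extracts one four-bit field from a 32-bit CR image. It assumes the usual PowerPC
convention that field 0 occupies the most-significant four bits; the function name
cr_field is hypothetical and exists only for this example.

```python
def cr_field(cr: int, n: int) -> int:
    """Return four-bit CR field n (0-7) from a 32-bit CR image.

    Assumption: field 0 is the most-significant nibble, following the
    big-endian bit numbering used by the PowerPC architecture.
    """
    if not 0 <= n <= 7:
        raise ValueError("CR field index must be in the range 0-7")
    shift = 4 * (7 - n)          # field 7 sits in the low-order nibble
    return (cr >> shift) & 0xF


# A CR image whose field 0 is 0b1000 and whose field 7 is 0b0010.
cr_image = 0x80000002
fields = [cr_field(cr_image, n) for n in range(8)]
```

A branch that tests a particular field only needs the four bits returned for that
field; the remaining seven fields are unaffected.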

1.3.2.1.4 Floating-Point Status and Control Register (FPSCR) 

The floating-point status and control register (FPSCR) is a user-level register that contains 
all exception signal bits, exception summary bits, exception enable bits, and rounding 
control bits needed for compliance with the IEEE 754 standard. The FPSCR is 32 bits wide 
in all implementations. 

1.3.2.1.5 Machine State Register (MSR)

The machine state register (MSR) is a supervisor-level register that defines the state of the 
processor. The contents of this register are saved when an exception is taken and restored
when the exception handling completes. The MPC601 implements the MSR as a 32-bit
register; 64-bit PowerPC processors implement it as a 64-bit register.

1.3.2.1.6 Segment Registers (SRs) 

The sixteen 32-bit segment registers (SRs) are present only in 32-bit PowerPC 
implementations. Figure 2-12 shows the format of a segment register when the T bit is 
cleared and Figure 2-13 shows the layout when the T bit is set. The fields in the segment 
register are interpreted differently depending on the value of bit 0. Note that 64-bit 
PowerPC implementations use a segment table rather than the segment registers for 
segment information. 

1.3.2.1.7 Special-Purpose Registers (SPRs) 

The PowerPC operating environment architecture defines numerous special-purpose
registers that serve a variety of functions, such as providing controls, indicating status, 
configuring the processor, and performing special operations. Some SPRs are accessed 
implicitly as part of executing certain instructions. All SPRs can be accessed by using the 
Move to/from Special Purpose Register instructions, mtspr and mfspr. 

In the MPC601, all SPRs are 32 bits wide. 

1.3.2.1.8 User-Level SPRs 

The following MPC601 SPRs are accessible by user-level software: 

• Link register (LR) — The link register can be used to provide the branch target
address and to hold the return address after branch and link instructions. The LR is
64 bits wide in 64-bit implementations and 32 bits wide in 32-bit implementations.

• Count register (CTR) — The CTR is decremented and tested automatically as a result
of branch-and-count instructions. The CTR is 64 bits wide in 64-bit
implementations and 32 bits wide in 32-bit implementations.

• The integer exception register (XER) contains the integer carry and overflow bits
and two fields for the Load String and Compare Byte Indexed (lscbx) instruction.
The XER is 32 bits wide in all implementations.

Note that while these registers are defined as SPRs and can be accessed by using the mtspr
and mfspr instructions, these registers are typically accessed implicitly. In addition, the
PowerPC architecture defines a 64-bit time base register (TB), which replaces the real-time
clock implementation on the MPC601.

1.3.2.1.9 Supervisor-Level SPRs 

The MPC601 also contains SPRs that can be accessed only by supervisor-level software. 
These registers consist of the following: 

• The 32-bit data access exception (DAE)/source instruction service register (DSISR) 
defines the cause of data access and alignment exceptions. 

• The data address register (DAR) is a 32-bit register that holds the address of an 
access after an alignment or data access exception. 

• Decrementer register (DEC) is a 32-bit decrementing counter that provides a
mechanism for causing a decrementer exception after a programmable delay.
The PowerPC architecture defines that the DEC frequency be provided as a subdivision
of the processor clock frequency; however, the MPC601 implements a separate RTC
which also serves the DEC.

• The 32-bit table search description register 1 (SDR1) specifies the page table format
used in logical-to-physical address translation for pages.

• The machine status save/restore register 0 (SRR0) is a 32-bit register that is used by
the MPC601 for saving the address of the instruction that caused the exception, and
the address to return to when a Return from Interrupt (rfi) instruction is executed.

• The machine status save/restore register 1 (SRR1) is a 32-bit register used to save
machine status on exceptions and to restore machine status when an rfi instruction
is executed.

• General SPRs, SPRG0-SPRG3, are 32-bit registers provided for operating system
use.

• The external access register (EAR) is a 32-bit register that controls access to the
external control facility through the External Control Input Word Indexed (eciwx)
and External Control Output Word Indexed (ecowx) instructions.

• The processor version register (PVR) is a 32-bit, read-only register that identifies the
version (model) and revision level of the PowerPC processor.

• Block address translation (BAT) registers — The PowerPC architecture defines 16
BAT registers, divided into four pairs of data BATs (DBATs) and four pairs of
instruction BATs (IBATs). The MPC601 includes four pairs of unified BATs
(BAT0U-BAT3U and BAT0L-BAT3L). See Figure 2-1 for a list of the SPR
numbers for the BAT registers. Figure 2-23 and Figure 2-24 show the layout of the
upper and lower BAT registers. Note that the format for the MPC601's
implementation of the BAT registers differs from the PowerPC architecture
definition.

In addition, 64-bit PowerPC processors implement a supervisor-level, 64-bit address space
register (ASR) that defines the physical address of the segment tables in memory. 

1.3.2.2 MPC601 Programming Model and Additional Registers 

The MPC601 includes the following registers that are not part of the PowerPC architecture. 

• Real-time clock (RTC) registers— RTCU and RTCL (RTC upper and RTC lower). 
The RTCU register maintains the number of seconds from a time specified by 
software. The RTCL register maintains a fraction of the current second in 
nanoseconds. The contents of either register can be copied to any GPR. These
registers are specific to the MPC601 and are also implemented in the POWER
architecture. They are not supported in the PowerPC architecture, which uses a time
base facility based on the processor clock rather than a separate real-time clock. For
more information, see Section 2.2.5.3, "Real-Time Clock (RTC) Registers."

• MQ register (MQ). The MQ register is an MPC601-specific, 32-bit register used as a
register extension to accommodate the product for the multiply instructions and the
dividend for the divide instructions. It is also used as an operand of long rotate and
shift instructions. This register is also implemented in the POWER architecture; it is
provided for compatibility with that architecture and is not part of the PowerPC
architecture. The MQ register is typically accessed implicitly as part of executing a
computational instruction. For more information, see Section 2.2.5.1, "MQ Register
(MQ)."

• Block-address translation (BAT) registers. The MPC601 includes eight block-
address translation registers (BATs), consisting of four pairs of BATs (BAT0U-
BAT3U and BAT0L-BAT3L). See Figure 2-1 for a list of the SPR numbers for the
BAT registers. Figure 2-23 and Figure 2-24 show the formats of the upper and lower
BAT registers. Note that the PowerPC architecture defines twice as many BAT
registers, and other PowerPC implementations have two sets of four BAT pairs: four
pairs of upper and lower IBATs (which occupy the space of the unified BATs in the
MPC601) and four pairs of upper and lower DBATs (located in the subsequent eight
positions at SPR numbers 536-543).

• The hardware implementation registers, HID0-HID2, HID5, and HID15 are 
provided primarily for debugging. For more information, see Section 2.3.3.12.1,
"Checkstop Sources and Enables Register — HID0" through Section 2.3.3.12.5,
"Processor Identification Register (PIR)— HID15." HID15 holds the four-bit 
processor identification tag (PID) that is useful for differentiating processors in 
multiprocessor system designs. For more information, see Section 2.3.3.12.5, 
"Processor Identification Register (PIR) — HID15." Note that while it is not 
guaranteed that the implementation of HID registers is consistent among PowerPC 
processors, other processors may be designed with similar or identical HID 
registers.

1.3.3 Instruction Set and Addressing Modes

The following subsections describe the PowerPC instruction set and addressing modes in 
general. Differences in the MPC601's instruction set are described in Section 1.3.3.2,
"MPC601 Instruction Set." 

1.3.3.1 PowerPC Instruction Set and Addressing Modes

All PowerPC instructions are encoded as single-word (32-bit) opcodes. Instruction formats 
are consistent among all instruction types, permitting efficient decoding to occur in parallel 
with operand accesses. This fixed instruction length and consistent format greatly 
simplifies instruction pipelining. In addition, each instruction is defined in a way that 
simplifies pipelined implementations and allows maximum realization of instruction-level 
parallelism. 
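
To illustrate why a fixed 32-bit format simplifies decoding, the following Python
sketch splits an instruction word into fields with plain shifts and masks. The field
positions used here (primary opcode in bits 0-5, rD in bits 6-10, rA in bits 11-15,
and a signed immediate in bits 16-31) are the D-form layout from the general PowerPC
architecture definition rather than from this section, and the function name
decode_d_form is hypothetical.

```python
def decode_d_form(word: int) -> dict:
    """Split a 32-bit instruction word into D-form fields.

    Assumption: D-form layout (opcode, rD, rA, SIMM) as defined by the
    PowerPC architecture; bit 0 is the most-significant bit.
    """
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")
    simm = word & 0xFFFF
    if simm & 0x8000:                    # sign-extend the 16-bit immediate
        simm -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,   # bits 0-5
        "rD": (word >> 21) & 0x1F,       # bits 6-10
        "rA": (word >> 16) & 0x1F,       # bits 11-15
        "SIMM": simm,                    # bits 16-31
    }


# 0x38600001 is the encoding of addi r3,0,1 (primary opcode 14).
decoded = decode_d_form(0x38600001)
```

Because every field sits at a fixed position, all of the shifts above can be done in
parallel in hardware, before the opcode itself has been interpreted.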

1.3.3.1.1 PowerPC Instruction Set 

The PowerPC instructions are divided into the following categories: 

• Integer instructions — These include computational and logical instructions. 

— Integer arithmetic instructions 

— Integer compare instructions 

— Integer logical instructions 

— Integer rotate and shift instructions 

• Floating-point instructions — These include floating-point computational
instructions, as well as instructions that affect the floating-point status and control 
register (FPSCR). 

— Floating-point arithmetic instructions 

— Floating-point multiply/add instructions 

— Floating-point rounding and conversion instructions

— Floating-point compare instructions 

— Floating-point status and control instructions 

• Load/store instructions — These include integer and floating-point load and store
instructions. 

— Integer load and store instructions 

— Integer load and store multiple instructions 

— Floating-point load and store 

— Floating-point move instructions 

• Flow control instructions — These include branching instructions, condition register 
logical instructions, trap instructions, and other instructions that affect the 
instruction flow. 

— Branch and trap instructions 

— Condition register logical instructions

• Processor control instructions — These instructions are used for synchronizing
memory accesses and management of caches, UTLBs, and the segment registers.

— Move to/from special purpose register instructions 

• Memory control instructions — These instructions provide control of caches, TLBs, 
and segment registers. 

— Supervisor-level cache management instructions 

— User-level cache instructions 

— Segment register manipulation instructions 

— Translation look-aside buffer management instructions 

Note that this grouping of the instructions does not indicate which execution unit executes 
a particular instruction or group of instructions. This information, which is useful in taking 
full advantage of superscalar parallel instruction execution, is provided in Chapter 7, 
"Instruction Timing," and Chapter 10, "Instruction Set." 

Integer instructions operate on byte, half-word, and word operands. Floating-point 
instructions operate on single-precision and double-precision floating-point operands. The 
PowerPC architecture uses instructions that are four bytes long and word-aligned. It 
provides for byte, half-word, and word operand loads and stores between memory and a set 
of 32 general-purpose registers (GPRs). It also provides for word and double-word operand 
loads and stores between memory and a set of 32 floating-point registers (FPRs). 

Computational instructions do not modify memory. To use a memory operand in a 
computation and then modify the same or another memory location, the memory contents 
must be loaded into a register, modified, and then written back to the target location with 
distinct instructions. 

PowerPC processors follow the program flow when they are in the normal execution state. 
However, the flow of instructions can be interrupted directly by the execution of an
instruction or by an asynchronous event. Either kind of exception may cause one of several 
components of the system software to be invoked. 

A program references memory using the effective address computed by the processor when
it executes a memory access or branch instruction, or when it fetches the next sequential 
instruction. 

1.3.3.1.2 Calculating Effective Addresses 

The effective address (EA) is the 32-bit address computed by the processor when executing 
a memory access or branch instruction or when fetching the next sequential instruction. 

The PowerPC architecture supports two simple memory addressing modes: 

• EA = (rA|0) + offset (including offset = 0) (register indirect with immediate index)

• EA = (rA|0) + rB (register indirect with index)

These simple addressing modes allow efficient address generation for memory accesses.
Calculation of the effective address for aligned transfers occurs in a single clock cycle. 

For a memory access instruction, if the sum of the effective address and the operand length 
exceeds the maximum effective address, the storage operand is considered to wrap around 
from the maximum effective address to effective address 0. 

Effective address computations for both data and instruction accesses use 32-bit unsigned
binary arithmetic. A carry from bit 0 is ignored.
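
The two addressing modes and the wraparound rule above can be sketched in Python as
follows. The register file is modeled as a plain list, and the function name
effective_address and its keyword arguments are hypothetical; the sketch only shows
the arithmetic, not any real instruction encoding.

```python
MASK32 = 0xFFFFFFFF


def effective_address(gpr, ra, *, offset=None, rb=None):
    """Compute a 32-bit EA for either addressing mode.

    gpr    -- list of 32 register values (a hypothetical model)
    ra     -- index of rA; index 0 selects the value 0, per (rA|0)
    offset -- signed immediate (register indirect with immediate index)
    rb     -- index of rB (register indirect with index)
    """
    if (offset is None) == (rb is None):
        raise ValueError("specify exactly one of offset or rb")
    base = 0 if ra == 0 else gpr[ra]
    addend = offset if offset is not None else gpr[rb]
    # 32-bit unsigned arithmetic: a carry out of the top bit is discarded.
    return (base + addend) & MASK32


gpr = [0] * 32
gpr[0] = 0x1234        # not used as a base, because rA = 0 means the value 0
gpr[4] = 0xFFFFFFF0
gpr[5] = 0x00000020

ea_immediate = effective_address(gpr, 4, offset=8)
ea_indexed = effective_address(gpr, 4, rb=5)      # wraps past 0xFFFFFFFF
ea_absolute = effective_address(gpr, 0, offset=0x100)
```

The indexed case shows the wraparound: the sum exceeds the maximum effective address,
so the result continues from effective address 0.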

1.3.3.2 MPC601 Instruction Set 

The MPC601 instruction set is defined as follows.

• The MPC601 implements the majority of the 32-bit instructions in the PowerPC 
architecture, and traps PowerPC instructions that it does not implement to the illegal 
instruction program exception handler for execution by a software envelope. These 
instructions are described in Appendix C, "PowerPC Instructions Not Implemented
in MPC601."

• The MPC601 supports a number of POWER instructions that are otherwise not 
implemented in the PowerPC architecture. These are listed in Appendix B,
"POWER Architecture Cross Reference." Individual instructions are described in 
Chapter 10, "Instruction Set." 

• The MPC601 implements the External Control Input Word Indexed (eciwx) and 
External Control Output Word Indexed (ecowx) instructions, which are optional in 
the PowerPC architecture definition. 

• Several of the instructions implemented in the MPC601 function somewhat
differently than they are defined in the PowerPC architecture. These differences 
typically stem from design differences; for instance, the PowerPC architecture 
defines several cache control instructions specific to separate instruction and data 
cache designs. 

When executed on the MPC601, such instructions may provide a subset of the 
functions of the architected instruction or they may be no-ops. 

For a list of all PowerPC instructions and all MPC601-specific instructions, see
Appendix A, "Instruction Set Listings." Chapter 10, "Instruction Set," describes each
instruction, indicating whether an instruction is MPC601-specific and describing any
differences in the implementation on the MPC601.

1.3.4 Cache Implementation 

The following subsections describe the PowerPC cache implementation in general, and the 
MPC601-specific implementation, respectively.

1.3.4.1 PowerPC Cache Implementation 

PowerPC cache implementations are implementation-specific. For example, some 
PowerPC processors may have separate instruction and data caches (Harvard architecture),
while the MPC601 uses a unified cache. This causes PowerPC cache control instructions to
work differently, but compatibly, when executed on the MPC601; for example, the icbi
(Instruction Cache Block Invalidate) instruction is treated as a no-op when executed by the
MPC601.

PowerPC implementations can control the following memory access modes on a page or 
block basis. 

• Write-back/write-through mode 

• Cache-inhibited mode 

• Memory coherency 

To ensure coherency among caches in multiprocessor (or multiple caching-device)
implementations, PowerPC processors support the MESI protocol. MESI stands for 
modified/exclusive/shared/invalid. These four states indicate the state of the cache block as 
follows: 

• Modified — The cache block is modified with respect to system memory; that is, this 
cache block holds the only valid data for this address. 

• Exclusive — This cache block holds valid data that is identical to the data at this
address in system memory. No other cache has this data.

• Shared — This cache block holds valid data that is identical to the data at this
address in system memory and in at least one other caching device.

• Invalid — This cache block does not hold valid data.

Note that in the MPC601 processor, a block is defined as an eight-word sector. The 
PowerPC virtual environment architecture also defines a set of cache control instructions. 
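
The four states listed above can be captured in a small Python sketch. It encodes only
what the definitions say about each state; it does not model state transitions or bus
transactions, and the helper names are hypothetical.

```python
from enum import Enum


class Mesi(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def holds_valid_data(state: Mesi) -> bool:
    """Every state except Invalid holds valid data for the address."""
    return state is not Mesi.INVALID


def memory_is_current(state: Mesi) -> bool:
    """True when the cached copy is identical to system memory (E or S)."""
    return state in (Mesi.EXCLUSIVE, Mesi.SHARED)


def is_only_valid_copy(state: Mesi) -> bool:
    """True when this block holds the only valid data for the address (M)."""
    return state is Mesi.MODIFIED
```

For example, a block in the Modified state holds valid data but system memory is not
current, which is why a snooped read that hits such a block requires special handling.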

1.3.4.2 MPC601 Cache Implementation 

The MPC601 has a 32-Kbyte, eight-way set-associative unified (instruction and data) 
cache. The cache is physically addressed and can operate in either write-back or write- 
through mode. Either memory update policy can be selected on a per-page or per-block 
basis. 

The cache is configured as eight sets of 64 lines. Each line consists of two sectors, four state 
bits (two per sector), and an address tag. The two state bits implement the four-state MESI 
(modified-exclusive-shared-invalid) protocol. Each sector contains eight 32-bit words. 
Note that the PowerPC architecture defines the term block as the cacheable unit. For the 
MPC601 processor, the block is a sector. A block diagram of the cache organization is 
shown in Figure 1-4. 
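
As a check on the organization described above, the following Python sketch multiplies
out the stated geometry and derives the line and sector sizes. The constant and
function names are hypothetical; the numbers come from the text (eight sets of 64
lines, two sectors per line, eight 32-bit words per sector).

```python
SETS = 8
LINES_PER_SET = 64
SECTORS_PER_LINE = 2
WORDS_PER_SECTOR = 8
BYTES_PER_WORD = 4

SECTOR_BYTES = WORDS_PER_SECTOR * BYTES_PER_WORD        # 32 bytes
LINE_BYTES = SECTORS_PER_LINE * SECTOR_BYTES            # 64 bytes (16 words)
CACHE_BYTES = SETS * LINES_PER_SET * LINE_BYTES         # total capacity


def line_base(addr: int) -> int:
    """Address of the first byte of the cache line containing addr."""
    return addr & ~(LINE_BYTES - 1) & 0xFFFFFFFF


def sector_in_line(addr: int) -> int:
    """Index (0 or 1) of the sector within its line that contains addr."""
    return (addr // SECTOR_BYTES) % SECTORS_PER_LINE
```

The product works out to 32 Kbytes, matching the stated cache size, and a 64-byte line
means the low-order six address bits select a byte within the line.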

Each cache line contains 16 contiguous words from memory that are loaded from a 
16-word boundary (that is, bits A26-A31 of the logical addresses are zero); thus, a cache
line never crosses a page boundary. Misaligned accesses across a page boundary can incur
a performance penalty.

Cache operations are always performed on a sector basis (that is, the cache is snooped and
updated and coherency is maintained on a per-sector basis). However, if the other sector in 
the line is marked invalid, an optional, low-priority update of that sector is attempted after 
the sector that contained the critical word is filled. This function can be disabled. An LRU 
algorithm is used to select the cache line. 

External bus transactions that load instructions or data into the cache always transfer the 
missed quad word first, regardless of its location in a cache sector; then the rest of the cache 
sector is filled. As the missed quad word is loaded into the cache, it is simultaneously 
forwarded to the appropriate execution unit so instruction execution resumes as quickly as 
possible. 
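
The fill order described above can be sketched in Python as follows. This is only an
illustration of the ordering, under the assumptions that a quad word is four 32-bit
words (16 bytes) and that a sector is 32 bytes; the function name sector_fill_order is
hypothetical, and the sketch says nothing about individual bus beats.

```python
QUAD_WORD_BYTES = 16   # assumption: a quad word is four 32-bit words
SECTOR_BYTES = 32      # eight 32-bit words per sector


def sector_fill_order(miss_addr: int) -> list:
    """Return quad-word base addresses in the order they are loaded.

    The quad word containing the missed address comes first; the rest
    of the sector follows.
    """
    sector_base = miss_addr & ~(SECTOR_BYTES - 1)
    critical = miss_addr & ~(QUAD_WORD_BYTES - 1)
    quads = [sector_base + i * QUAD_WORD_BYTES
             for i in range(SECTOR_BYTES // QUAD_WORD_BYTES)]
    return [critical] + [q for q in quads if q != critical]
```

For a miss at address 0x1014, the quad word at 0x1010 is loaded and forwarded first,
and the quad word at 0x1000 completes the sector afterward.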

Cache coherency is enforced by on-chip hardware bus snooping logic. Since the cache tag 
directory has a separate port dedicated to snooping bus transactions, bus snooping traffic 
does not interfere with processor access to the cache unless a snoop hit occurs.

Figure 1-4. Cache Unit Organization

1.3.5 Exception Model

The following subsections describe the PowerPC exception model and the MPC601 
implementation, respectively.

1.3.5.1 PowerPC Exception Model

The PowerPC exception mechanism allows the processor to change to supervisor state as 
a result of external signals, errors, or unusual conditions arising in the execution of 
instructions. When exceptions occur, information, such as the instruction that should be 
executed after control is returned to the original program and the contents of the machine 
state register, is saved to the save/restore registers (SRR0 and SRR1), program control
passes from user to supervisor level, and software continues execution at an address 
(exception vector) predetermined for each exception. 
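
The sequence above (save state, enter supervisor level, continue at a vector) can be
modeled with a simplified Python sketch. The processor state is a plain dictionary,
the MSR[PR] mask of 0x4000 is taken from the general 32-bit PowerPC MSR layout rather
than from this section, and the vector value in the example is illustrative only. Real
hardware also changes other MSR bits and saves only part of the MSR in SRR1; those
details are omitted here.

```python
MSR_PR = 0x4000   # assumption: user-mode (problem state) bit of a 32-bit MSR


def take_exception(cpu: dict, vector: int, resume_addr: int) -> dict:
    """Model exception entry on a hypothetical machine-state dictionary."""
    new = dict(cpu)
    new["SRR0"] = resume_addr           # instruction to execute after return
    new["SRR1"] = cpu["MSR"]            # saved machine state (simplified)
    new["MSR"] = cpu["MSR"] & ~MSR_PR   # switch to supervisor level
    new["PC"] = vector                  # continue at the exception vector
    return new


def rfi(cpu: dict) -> dict:
    """Model Return from Interrupt: restore MSR and resume at SRR0."""
    new = dict(cpu)
    new["MSR"] = cpu["SRR1"]
    new["PC"] = cpu["SRR0"]
    return new


user = {"PC": 0x2000, "MSR": 0xC000, "SRR0": 0, "SRR1": 0}
in_handler = take_exception(user, vector=0x500, resume_addr=0x2004)
resumed = rfi(in_handler)
```

The sketch also shows why a handler should copy SRR0 and SRR1 early: a second
exception taken inside the handler would overwrite both values.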

Although multiple exception conditions can map to a single exception vector, the specific 
condition can be determined by examining a register associated with the exception — for 
example, the DAE/source instruction service register (DSISR) and the floating-point status 
and control register (FPSCR). Additionally, specific exception conditions can be explicitly 
enabled or disabled by software. 

Although the PowerPC architecture supports out-of-order instruction dispatch, exceptions 
are handled in program order; therefore, while exception conditions may be recognized out 
of order, they are handled strictly in order. When an instruction-caused exception is 
recognized, any unexecuted instructions that appear earlier in the instruction stream, 
including any that have not yet entered execute state, are allowed to complete. Any 
exceptions caused by those instructions are handled in order. Likewise, exceptions that are
asynchronous and precise are recognized when they occur, but are not handled until all 
instructions currently in execute stage successfully complete execution and report their 
results. 

Unless a catastrophic condition causes a system reset or machine check exception, only one 
exception is handled at a time. If, for example, a single instruction encounters multiple 
exception conditions, those conditions are encountered sequentially. After the exception 
handler handles an exception, the instruction execution continues until the next exception 
condition is encountered. This method of recognizing and handling exception conditions 
sequentially guarantees that exceptions are recoverable. 

Exception handlers should save the information held in SRR0 and SRR1 early to prevent 
the program state from being lost due to a system reset or machine check exception or to 
an instruction-caused exception in the exception handler. 
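
The reason for saving SRR0 and SRR1 early can be illustrated with a toy model. The Python sketch below is not MPC601 code: the Cpu class, its fields, and its methods are hypothetical names chosen for illustration, the register values are arbitrary, and the model keeps only the behavior described above (state is copied to SRR0 and SRR1, and execution continues at a vector offset). Changes the processor makes to the MSR on exception entry are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Toy model holding only the state mentioned in the text above."""
    pc: int = 0      # address of the next instruction to execute
    msr: int = 0     # machine state register, treated as an opaque value
    srr0: int = 0    # save/restore register 0
    srr1: int = 0    # save/restore register 1
    stack: list = field(default_factory=list)  # handler's private save area

    def take_exception(self, vector_offset: int) -> None:
        # Hardware side: SRR0/SRR1 are overwritten, then control moves to the vector.
        self.srr0 = self.pc
        self.srr1 = self.msr
        self.pc = vector_offset

    def save_context(self) -> None:
        # Software side: copy SRR0/SRR1 where a nested exception cannot clobber them.
        self.stack.append((self.srr0, self.srr1))

    def restore_context(self) -> None:
        self.srr0, self.srr1 = self.stack.pop()


cpu = Cpu(pc=0x1000, msr=0x4000)  # assumed example values
cpu.take_exception(0x00500)       # external interrupt vector offset
cpu.save_context()                # handler saves the return state early
cpu.take_exception(0x00700)       # nested program exception inside the handler
nested_srr0 = cpu.srr0            # SRR0 now points into the first handler
cpu.restore_context()             # original return state is recovered
```

Without the call to save_context, the second exception would have left no copy of the original return address anywhere in the model.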

The PowerPC architecture supports four types of exceptions: 

• Synchronous, precise — These are caused by instructions. All instruction-caused 
exceptions are handled precisely; that is, the machine state at the time the exception 
occurs is known and can be completely restored. This means that the precise address 
of the faulting instruction is provided to the exception handler and that neither the 
faulting instruction nor subsequent instructions in the code stream will complete 
execution. 

• Synchronous, imprecise mode — The PowerPC architecture permits the 
implementation of imprecise floating-point exceptions. The use of recoverable and 
nonrecoverable versions of this mode can be enabled or disabled by setting one of 
the FE0 and FE1 bits in the MSR. Note that in the MPC601, these bits are internally 
ORed together, causing all floating-point exceptions to be handled precisely. 

• Asynchronous, precise — The external interrupt and decrementer exceptions are 
maskable asynchronous exceptions that are handled precisely. When these 
exceptions occur, their handling is postponed until all instructions, and any 
exceptions associated with those instructions, complete execution. 

• Asynchronous, imprecise — There are two non-maskable asynchronous exceptions 
that are imprecise: system reset and machine check exceptions. These exceptions 
may not be recoverable, or may provide a limited degree of recoverability for 
diagnostic purposes. 

The PowerPC architecture defines several of the exceptions differently than the MPC601 
implementation. For example, the PowerPC exception model provides a unique vector for 
the trace exception; the MPC601 vectors trace exceptions to the run-mode exception 
handler. Other differences are noted in the following section, Section 1.3.5.2, "MPC601 
Exception Model." 

1.3.5.2 MPC601 Exception Model 

All MPC601 exceptions can be described as either precise or imprecise and either 
synchronous or asynchronous. Asynchronous exceptions are caused by events external to 
the processor's execution; synchronous exceptions, which are all handled precisely by the 
MPC601, are caused by instructions. 

The MPC601 exception classes are shown in Table 1-1. 

Table 1-1. MPC601 Exception Classifications 

Synchronous/Asynchronous   Precise/Imprecise   Exception Type
------------------------   -----------------   -----------------------------
Asynchronous               Imprecise           Machine check
                                               System reset

Asynchronous               Precise             External interrupt
                                               Decrementer

Synchronous                Precise             Instruction-caused exceptions

Although exceptions have other characteristics as well, such as whether they are maskable 
or nonmaskable, the distinctions shown in Table 1-1 define categories of exceptions that the 
MPC601 handles uniquely. Note that Table 1-1 includes no synchronous imprecise 
exceptions. While the PowerPC architecture supports imprecise handling of floating-point 
exceptions, this functionality is not implemented in the MPC601. 

The MPC601's exceptions, and the conditions that cause them, are listed in Table 1-2. 
Exceptions that are specific to the MPC601 are indicated. 

Table 1-2. Exceptions and Conditions 

Exception       Vector        Causing Conditions
Type            Offset (hex)
--------------  ------------  ----------------------------------------------------------
Reserved        00000         —

System reset    00100         A system reset is caused by the assertion of either
                              SRESET or HRESET.

Machine check   00200         A machine check is caused by the assertion of the TEA
                              signal.

Data access     00300         The cause of a data access exception can be determined
                              by the bit settings in the DSISR, listed as follows:

                              1   Set if the translation of an attempted access is
                                  not found in the primary hash table entry group
                                  (HTEG), or in the rehashed secondary HTEG, or in
                                  the range of a BAT register; otherwise cleared.
                              4   Set if a memory access is not permitted by the page
                                  or BAT protection mechanism described in Chapter 6,
                                  "Memory Management Unit"; otherwise cleared.
                              5   Set if the access was to an I/O segment (SR[T]=1)
                                  by a load/store with reservation instruction;
                                  otherwise cleared.
                              6   Set for a store operation and cleared for a load
                                  operation.
                              9   Set if an EA matches the address in the DABR while
                                  in one of the three compare modes.
                              11  Set if eciwx or ecowx is used and EAR[E] is cleared.

Instruction     00400         An instruction access exception is caused when an
access                        instruction fetch cannot be performed for any of the
                              following reasons:

                              • The effective address cannot be translated. That is,
                                there is a page fault for this portion of the
                                translation, so an instruction access exception must
                                be taken to retrieve the translation from a storage
                                device such as a hard disk drive.
                              • The fetch access is to an I/O segment.
                              • The fetch access violates memory protection. If the
                                K bits in the segment register and the PP bits in the
                                PTE are set to prohibit read access, instructions
                                cannot be fetched from this location.

External        00500         An external interrupt occurs when the INT signal is
interrupt                     asserted.

Alignment       00600         An alignment exception is caused when the MPC601 cannot
                              perform a memory access for one of the following
                              reasons:

                              • The operand of a floating-point load or store or
                                load/store with reservation operation is in an I/O
                                segment (SR[T]=1).
                              • An lscbx instruction crosses a page boundary.
                              • The operand of a load or store (including string
                                loads and stores) crosses a protection boundary.
                              • The operand of an lmw or stmw instruction crosses a
                                segment or BAT boundary.
                              • The operand of a Data Cache Block Set to Zero (dcbz)
                                instruction is in a page specified as write-through
                                or cache-inhibited for a page-address translation
                                access.

Program         00700         A program exception is caused by one of the following
                              exception conditions, which correspond to bit settings
                              in SRR1 and arise during execution of an instruction:

                              • Floating-point enabled exception — A floating-point
                                enabled exception condition is generated when the
                                following condition is met:
                                (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1.
                                FPSCR[FEX] is set by the execution of a
                                floating-point instruction that causes an enabled
                                exception or by the execution of a "move to FPSCR"
                                instruction that results in both an exception
                                condition bit and its corresponding enable bit being
                                set in the FPSCR.
                              • Illegal instruction — An illegal instruction program
                                exception is generated when execution of an
                                instruction is attempted with an illegal opcode or
                                illegal combination of opcode and extended opcode
                                fields, or when execution of an optional instruction
                                not provided in the MPC601 is attempted (these do not
                                include those optional instructions that are treated
                                as no-ops).
                              • Privileged instruction — A privileged instruction
                                type program exception is generated when the
                                execution of a privileged instruction is attempted
                                and the MSR register user privilege bit, MSR[PR], is
                                set. In the MPC601, this exception is generated for
                                mtspr or mfspr with an invalid SPR field if SPR[0]=1
                                and MSR[PR]=1. This may not be true for all PowerPC
                                processors.
                              • Trap — A trap type program exception is generated
                                when any of the conditions specified in a trap
                                instruction is met.
                              • Illegal operations — The MPC601 takes illegal
                                operation program exceptions for unimplemented
                                PowerPC instructions. The PowerPC instruction set is
                                described in Chapter 3, "Addressing Modes and
                                Instruction Set Summary."

Floating-point  00800         A floating-point unavailable exception is caused by an
unavailable                   attempt to execute a floating-point instruction
                              (including floating-point load, store, and move
                              instructions) when the floating-point available bit is
                              disabled (MSR[FP]=0).

Decrementer     00900         The decrementer exception occurs when the most
                              significant bit of the decrementer (DEC) register
                              transitions from 0 to 1.

I/O error       00A00         An I/O error exception is taken only when an operation
                              to an I/O segment fails (such a failure is indicated to
                              the MPC601 by a particular bus reply packet). If an I/O
                              error exception is taken on a memory access directed to
                              an I/O segment, SRR0 contains the address of the
                              instruction following the offending instruction. Note
                              that this exception may not be implemented in other
                              PowerPC processors.

Reserved        00B00         —

System call     00C00         A system call exception occurs when a System Call (sc)
                              instruction is executed.

Reserved        00E00         Other PowerPC processors may use this vector for
                              floating-point assist exceptions.

Reserved        00E10-00FFF   —

Reserved        01000-02FFF   Reserved, implementation-specific

Run mode        02000         The run mode exception is taken depending on the
exception                     settings of the HID1 register and the MSR[SE] bit.

                              The following modes correspond with bit settings in the
                              HID1 register:

                              • Normal run mode — No address break points are
                                specified, and the MPC601 executes from zero to three
                                instructions per cycle.
                              • Single instruction step mode — One instruction is
                                processed at a time. The appropriate break action is
                                taken after an instruction is executed and the
                                processor quiesces.
                              • Limited instruction address compare — The MPC601
                                runs at full speed (in parallel) until the EA of the
                                instruction being decoded matches the EA contained in
                                HID2. Addresses for branch instructions and
                                floating-point instructions may never be detected.

                              The following mode is taken when the MSR[SE] bit is
                              set:

                              • MSR[SE] trace mode — Note that in other PowerPC
                                implementations, the trace exception is a separate
                                exception with its own vector x'00D00'.

Reserved        02001-03FFF   —
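
The DSISR bit list given for the data access exception in Table 1-2 lends itself to a small decoder. The Python sketch below is illustrative only: the names DSISR_BITS, bit_mask, and decode_dsisr are hypothetical, and it assumes a 32-bit DSISR using the PowerPC convention in which bit 0 is the most significant bit.

```python
# Bit meanings copied from the data access row of Table 1-2.
DSISR_BITS = {
    1: "translation not found in primary or secondary HTEG or BAT range",
    4: "access not permitted by page or BAT protection",
    5: "load/store with reservation to an I/O segment (SR[T]=1)",
    6: "store operation (cleared for a load)",
    9: "EA matched the DABR in a compare mode",
    11: "eciwx or ecowx used while EAR[E] is cleared",
}


def bit_mask(bit: int, width: int = 32) -> int:
    """Mask for a bit in PowerPC numbering, where bit 0 is the most significant."""
    return 1 << (width - 1 - bit)


def decode_dsisr(dsisr: int) -> list:
    """Return the descriptions of all listed bits that are set in dsisr."""
    return [text for bit, text in DSISR_BITS.items() if dsisr & bit_mask(bit)]


# Assumed example value: a protection violation caused by a store.
example = bit_mask(4) | bit_mask(6)
causes = decode_dsisr(example)
```

A handler following this pattern would read the DSISR once and branch on the decoded causes rather than testing raw constants throughout its code.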



1.3.6 Memory Management 

The following subsections describe the PowerPC memory management in general, and the 
specific MPC601 implementation, respectively. 

1.3.6.1 PowerPC Memory Management 

The primary functions of the MMU are to translate logical (effective) addresses to physical 
addresses for memory accesses, I/O accesses (most I/O accesses are assumed to be 
memory-mapped), and I/O controller interface accesses, and to provide access protection 
on blocks and pages of memory. 

There are three types of accesses generated by the MPC601 that require address translation: 
instruction accesses, data accesses to memory generated by load and store instructions, and 
I/O controller interface accesses generated by load and store instructions. 

The PowerPC MMU and exception model support demand-paged virtual memory. Virtual 
memory management permits execution of programs larger than the size of physical 
memory; demand-paged implies that individual pages are loaded into physical memory 
from backing storage only when they are first accessed by an executing program. 

PowerPC memory management differs for 32- and 64-bit implementations. Address 
translations are enabled by setting bits in the MSR — MSR[IT] enables instruction 
translations and MSR[DT] enables data translations. 

1.3.6.2 MPC601 Memory Management 

The MPC601 MMU provides 4 Gbytes of logical address space accessible to supervisor 
and user programs with a 4-Kbyte page size and 256-Mbyte segment size. Block sizes 
range from 128 Kbytes to 8 Mbytes and are software selectable. In addition, the MPC601 
uses an interim 52-bit virtual address and hashed page tables in the generation of 32-bit 
physical addresses. 
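
The sizes quoted above determine how a 32-bit logical address divides into fields. The following Python sketch derives the field widths from those sizes; the function and constant names are hypothetical. The final constant assumes that the low 28 bits of the logical address pass unchanged into the 52-bit virtual address, which is not stated in this section.

```python
PAGE_SIZE = 4 * 1024               # 4-Kbyte pages
SEGMENT_SIZE = 256 * 1024 * 1024   # 256-Mbyte segments
ADDRESS_SPACE = 4 * 1024 ** 3      # 4 Gbytes of logical address space
VIRTUAL_BITS = 52                  # interim virtual address width

OFFSET_BITS = PAGE_SIZE.bit_length() - 1        # byte offset within a page
SEGMENT_BITS = SEGMENT_SIZE.bit_length() - 1    # address bits within a segment
PAGE_INDEX_BITS = SEGMENT_BITS - OFFSET_BITS    # page number within a segment
NUM_SEGMENTS = ADDRESS_SPACE // SEGMENT_SIZE    # segments in the logical space
# Assumption: bits left over to identify the virtual segment.
VIRTUAL_SEGMENT_ID_BITS = VIRTUAL_BITS - SEGMENT_BITS


def split_effective_address(ea: int):
    """Return (segment number, page index within the segment, byte offset)."""
    segment = ea >> SEGMENT_BITS
    page_index = (ea >> OFFSET_BITS) & ((1 << PAGE_INDEX_BITS) - 1)
    offset = ea & (PAGE_SIZE - 1)
    return segment, page_index, offset


fields = split_effective_address(0x30012ABC)  # arbitrary example address
```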

A UTLB provides address translation in parallel with the on-chip cache access, incurring 
no additional time penalty. The UTLB is a 256-entry, two-way set-associative cache that 
contains instruction and data address translations. The MPC601 provides hardware table 
search capability on UTLB misses. Supervisor software can invalidate UTLB entries (both 
in the set) selectively. In addition, UTLB control instructions can optionally be broadcast 
on the external interface for remote invalidations. 

The MPC601 also provides a four-entry BTLB that maintains address translations for 
blocks of memory. These entries define blocks that can vary from 128 Kbytes to 8 Mbytes. 
The BTLB is maintained by system software. 

To accelerate the instruction unit operation, the MPC601 uses a four-entry ITLB. The ITLB 
contains up to four copies of the most recently used instruction address translations (page 
or block) providing the instruction unit access to the most recently used translations without 
requiring the UTLB or BTLB. The ITLB, including coherency, is maintained in hardware 
and uses an LRU replacement algorithm. 

The MPC601 has a high-bandwidth, 64-bit data bus and a 32-bit address bus. The MPC601 
interface protocol allows multiple masters to compete for system resources through a 
central external arbiter. Additionally, on-chip snooping logic maintains cache coherency in 
multiprocessor applications. The MPC601 supports single-beat and burst data transfers for 
memory accesses; it also supports both memory-mapped I/O and I/O controller interface 
addressing. 

The MPC601 MMU relies on the exception processing mechanism for the implementation 
of the paged virtual memory environment and for enforcing protection of designated 
memory areas. Exception processing is described in Chapter 5, "Exceptions." 
Section 2.3.1, "Machine State Register (MSR)," describes the MSR of the MPC601, which 
controls some of the critical functionality of the MMU. 

The hashed page table is a variable-sized data structure that defines the mapping between 
virtual page numbers and physical page numbers. The page table size is a power of 2, and 
its starting address is a multiple of its size. 

The page table contains a number of page table entry groups (PTEGs). A PTEG contains 
eight page table entries (PTEs) of eight bytes each; therefore each PTEG is 64 bytes long. 
PTEG addresses are entry points for table search operations. Figure 6-16 shows two PTEG 
addresses (PTEGaddr1 and PTEGaddr2) where a given PTE may reside. 
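
The layout rules stated above (power-of-2 table size, base aligned to the size, eight 8-byte PTEs per PTEG) can be checked with a few lines of arithmetic. This Python sketch uses hypothetical helper names and does not model the hash function that selects a PTEG.

```python
PTE_SIZE = 8                            # bytes per page table entry
PTES_PER_PTEG = 8                       # entries per group
PTEG_SIZE = PTE_SIZE * PTES_PER_PTEG    # bytes per group


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def is_valid_page_table(base: int, size: int) -> bool:
    """True if size is a power of 2 and base is a multiple of size."""
    return is_power_of_two(size) and base % size == 0


def pteg_count(size: int) -> int:
    """Number of PTEGs that fit in a table of the given size in bytes."""
    return size // PTEG_SIZE


def pte_address(pteg_addr: int, slot: int) -> int:
    """Address of entry `slot` (0 to 7) inside the PTEG at pteg_addr."""
    if not 0 <= slot < PTES_PER_PTEG:
        raise ValueError("slot must be in the range 0 to 7")
    return pteg_addr + slot * PTE_SIZE
```

For instance, a 64-Kbyte table based at 0x10000 satisfies both rules and holds 1024 PTEGs, whereas the same table based at 0x8000 would be rejected.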

1.3.7 Instruction Timing 

The PowerPC architecture is designed to minimize instruction latencies while maximizing 
overall instruction throughput. Although many of the instructions execute in a single clock 
cycle, in many cases overall instruction throughput is significantly greater than one 
instruction per clock cycle. Because the PowerPC architecture can be applied to such a 
wide variety of implementations, instruction timing details vary accordingly. 

The MPC601 processor has been designed to minimize average instruction execution 
latency. Latency is defined as the number of clock cycles necessary to execute an 
instruction and make ready the results of that execution for a subsequent instruction. For 
the majority of instructions in the MPC601, this can be simplified to include only the 
execute phase for a particular instruction. However, data access instructions require 
additional clock cycles between the execute phase and the writeback phase due to memory 
latencies. 

In accordance with this definition, logical, bit-field, and most integer instructions have a 
latency of one clock cycle (that is, results for these instructions are ready for use on 
the next clock cycle after issue). Other instructions, such as the integer multiply, require 
more than one clock cycle to complete execution. 

Effective throughput of more than one instruction per clock cycle can be realized by the 
many performance features in the MPC601 including pipelining, superscalar instruction 
issue, branch acceleration, and multiple execution units that operate independently and in 
parallel. 

Many of the execution units on the MPC601 are said to be pipelined. This implies that the 
particular execution unit is broken into stages. Each stage performs a specific step, which 
contributes to the overall execution of an instruction. The pipelined design is analogous to 
an assembly line where workers perform a specific task and pass the partially complete 
product to the next worker. 

When an instruction is issued to a pipelined execution unit, the first stage in the pipeline 
begins its designated work on that instruction. As an instruction is passed from one stage in 
the pipeline to the next, evacuated stages may accept new instructions. This design allows 
a single execution unit to be working on several different instructions simultaneously. Once 
the pipeline has been filled with instructions, the execution unit completes a multi-cycle 
instruction every clock. 

If the number of stages in each pipeline is equal to the total latency in clock cycles of its 
respective execution unit, the processor can continuously issue instructions to the same 
execution unit without stalling. Thus, when enough instructions have been issued to an 
execution unit to fill its pipeline, the first instruction will have completed execution and 
exited the pipeline, allowing subsequent instructions to be issued into the tail of the pipeline 
without interruption. 
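
The benefit of pipelining described above can be put in numbers. The Python sketch below is an idealized model with hypothetical function names: it assumes a single execution unit, no stalls, and a stage count equal to the unit's latency in clock cycles.

```python
def pipelined_cycles(num_instructions: int, stages: int) -> int:
    """Cycles to finish a stream on a fully pipelined unit with no stalls.

    The first instruction needs `stages` cycles to pass through the pipeline;
    each later instruction completes one cycle after its predecessor.
    """
    if num_instructions <= 0:
        return 0
    return stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions: int, latency: int) -> int:
    """Cycles if each instruction must finish before the next one starts."""
    return max(num_instructions, 0) * latency


pipelined = pipelined_cycles(10, 3)    # ten instructions, three-stage unit
serial = unpipelined_cycles(10, 3)     # same work without overlap
```

Under these assumptions the pipelined unit approaches one completed instruction per clock as the stream grows longer, even though each instruction still takes three cycles.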

1.3.8 System Interface 

The system interface is specific for each PowerPC processor; however, processor designs 
provide the same basic set of signals, with differences depending largely upon other design 
factors. 

The MPC601 provides a versatile system interface that allows for a wide range of 
implementations. The interface includes a 32-bit address bus, a 64-bit data bus, and 52 
control and information signals (see Figure 1-5). The system interface allows for 
address-only transactions as well as address and data transactions. The MPC601 control and 
information signals include the address arbitration, address start, address transfer, transfer 
attribute, address termination, data arbitration, data transfer, data termination, and 
processor state signals. Test and control signals provide diagnostics for selected internal 
circuitry. 



[Figure 1-5: block diagram of the MPC601 processor. Signal groups on one side: address, 
address arbitration, address start, address transfer, transfer attribute, address 
termination, and clocks. Signal groups on the other side: data, data arbitration, data 
transfer, data termination, processor state, and test and control. A +3.6 V supply is 
also shown.] 

Figure 1-5. System Interface 

The system interface supports bus pipelining, which allows the address tenure of one 
transaction to overlap the data tenure of another. The extent of the pipelining depends on 
external arbitration and control circuitry. Similarly, the MPC601 supports split-bus 
transactions for systems with multiple potential bus masters — one device can have 
mastership of the address bus while another has mastership of the data bus. Allowing 
multiple bus transactions to occur simultaneously increases the available bus bandwidth for 
other activity and as a result, improves performance. 

The MPC601 supports multiple masters through a bus arbitration scheme that allows 
various devices to compete for the shared bus resource. The arbitration logic can implement 
priority protocols, such as fairness, and can park masters to avoid arbitration overhead. The 
MESI protocol ensures coherency among multiple devices and system memory. Also, the 
MPC601's on-chip cache and UTLBs and optional second-level caches can be controlled 
externally. Software support for atomic memory operations minimizes the effects of data 
dependencies in multiple processor implementations. 

The MPC601 clocking structure allows the processor to operate at an integer multiple of 
the bus frequency. 

The following sections describe the MPC601 bus support for memory and I/O controller 
interface operations. Note that some signals perform different functions depending upon 
the addressing protocol used. 

1.3.8.1 Memory Accesses 

Memory accesses allow transfer sizes of 8, 16, 24, 32, 40, 48, 56, or 64 bits in one bus clock 
cycle. Data transfers occur in either single-beat transactions or four-beat burst transactions. 
A single-beat transaction transfers as much as 64 bits. Single-beat transactions are caused 
by non-cached accesses that access memory directly (that is, reads and writes when caching 
is disabled, cache-inhibited accesses, and stores in write-through mode). Burst transactions, 
which always transfer an entire cache sector (32 bytes), are initiated when a sector in the 
cache is read from or written to memory. Additionally, the MPC601 supports address-only 
transactions used to invalidate entries in other processors' TLBs and caches. 
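
The transfer sizes above are consistent with one another, as a short calculation shows. This Python sketch uses hypothetical names and takes only the figures quoted in this section: a 64-bit data bus, four beats per burst, and single-beat sizes in multiples of 8 bits.

```python
DATA_BUS_BITS = 64
BEATS_PER_BURST = 4
SINGLE_BEAT_SIZES_BITS = (8, 16, 24, 32, 40, 48, 56, 64)


def burst_size_bytes() -> int:
    """Bytes moved by one four-beat burst on the 64-bit data bus."""
    return BEATS_PER_BURST * DATA_BUS_BITS // 8


def is_valid_single_beat(size_bits: int) -> bool:
    """True if the size is one of the listed single-beat transfer sizes."""
    return size_bits in SINGLE_BEAT_SIZES_BITS
```

The burst size works out to 32 bytes, which matches the cache sector size given in the text.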

1.3.8.2 I/O Controller Interface Operations 

Both memory and I/O accesses can use the same bus transfer protocols. The MPC601 also 
has the ability to define memory areas as I/O controller interface areas. Accesses to the I/O 
controller interface redefine the function of some of the address transfer and transfer 
attribute signals and add control to facilitate transfers between the MPC601 and specific I/O 
devices. I/O controller interface transactions provide multiple transaction operations for 
variably-sized data transfers (1 to 128 bytes) and support a split request/response protocol. 
The distinction between the two types of transfers is made with separate signals — TS for 
memory-mapped accesses and XATS for I/O controller interface accesses. Refer to 
Chapter 9, "System Interface Operation," for more information. 

1.3.8.3 MPC601 Signals 

The MPC601 signals are grouped as follows: 

• Address arbitration signals — The MPC601 uses these signals to arbitrate for address 
bus mastership. 

• Address transfer start signals — These signals indicate that a bus master has begun a 
transaction on the address bus. 

• Address transfer signals — These signals, which consist of the address bus, address 
parity, and address parity error signals, are used to transfer the address and to ensure 
the integrity of the transfer. 

• Transfer attribute signals — These signals provide information about the type of 
transfer, such as the transfer size and whether the transaction is bursted, 
write-through, or cache-inhibited. 

• Address transfer termination signals — These signals are used to acknowledge the 
end of the address phase of the transaction. They also indicate whether a condition 
exists that requires the address phase to be repeated. 

• Data arbitration signals — The MPC601 uses these signals to arbitrate for data bus 
mastership. 

• Data transfer signals — These signals, which consist of the data bus, data parity, and 
data parity error signals, are used to transfer the data and to ensure the integrity of 
the transfer. 

• Data transfer termination signals — Data termination signals are required after each 
data beat in a data transfer. In a single-beat transaction, the data termination signals 
also indicate the end of the tenure, while in burst accesses, the data termination 
signals apply to individual beats and indicate the end of the tenure only after the final 
data beat. They also indicate whether a condition exists that requires the data phase 
to be repeated. 

• System status signals — These signals include the interrupt signal, checkstop signals, 
and both soft- and hard-reset signals. These signals are used to interrupt and, under 
various conditions, to reset the processor. 

• Processor state signals — These two signals are used to set the reservation coherency 
bit and set the size of the MPC601's output buffers. 

• Miscellaneous signals — These signals provide information about the state of the 
reservation coherency bit and the size of the MPC601's output buffers. 

• COP interface signals — The common on-chip processor (COP) unit is the master 
clock control unit and it provides a serial interface to the system for performing 
built-in self test (BIST). 

• Test interface signals — These signals are used for internal testing. 

• Clock signals — These signals determine the system clock frequency. These signals 
can also be used to synchronize multiprocessor systems. 

NOTE 

A bar over a signal name indicates that the signal is active 
low — for example, ARTRY (address retry) and TS (transfer 
start). Active-low signals are referred to as asserted (active) 
when they are low and negated when they are high. Signals that 
are not active-low, such as AP0-AP3 (address bus parity 
signals) and TT0-TT4 (transfer type signals) are referred to as 
asserted when they are high and negated when they are low. 
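
The asserted/negated convention in the note above reduces to comparing a signal's electrical level with its polarity. The Python sketch below uses a hypothetical polarity table containing only the four example signals named in the note.

```python
# True means the signal is active-low (drawn with a bar over its name).
ACTIVE_LOW = {
    "ARTRY": True,   # address retry
    "TS": True,      # transfer start
    "AP0": False,    # address bus parity
    "TT0": False,    # transfer type
}


def is_asserted(signal: str, level_high: bool) -> bool:
    """Return True if the signal is asserted at the given electrical level."""
    return level_high != ACTIVE_LOW[signal]
```

With this table, ARTRY is asserted when its level is low, while AP0 is asserted when its level is high.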

1.3.8.4 Signal Configuration 

Figure 1-6 illustrates the MPC601 microprocessor's pin configuration, showing how the 
signals are grouped. 

Figure 1-6. MPC601 Signal Groups 

1.3.8.5 Real-Time Clock Facility 

The real-time clock (RTC) facility, which is specific to the MPC601, provides a high- 
resolution measure of real time to provide time of day and date with a calendar range of 
136.19 years. The RTC consists of two registers — the RTC upper (RTCU) register and the 
RTC lower (RTCL) register. The RTCU register maintains the number of seconds from a 
point in time specified by software. The RTCL register counts nanoseconds. The contents 
of either register may be copied to any GPR. 
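
The quoted calendar range can be reproduced from the seconds counter alone. The Python sketch below is a rough check with hypothetical names: it assumes that RTCU is a 32-bit counter (the width is not stated in this section), that a year has 365 days, and that the RTCL value is a plain nanosecond count below one second.

```python
NS_PER_SECOND = 1_000_000_000
RTCU_BITS = 32  # assumption: register width is not given in the text above


def rtc_to_nanoseconds(rtcu: int, rtcl: int) -> int:
    """Combine seconds (RTCU) and nanoseconds (RTCL) into one nanosecond count."""
    if not 0 <= rtcl < NS_PER_SECOND:
        raise ValueError("RTCL must be less than one second of nanoseconds")
    return rtcu * NS_PER_SECOND + rtcl


def calendar_range_years(bits: int = RTCU_BITS, days_per_year: int = 365) -> float:
    """Years that elapse before a seconds counter of the given width wraps."""
    return (2 ** bits) / (days_per_year * 24 * 60 * 60)


range_years = round(calendar_range_years(), 2)
```

Under these assumptions the result agrees with the 136.19-year figure given above.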

Chapter 2 

Registers and Data Types 

This chapter describes the MPC601's register organization, how these registers are 
accessed, and how data is formatted in these registers. 

The MPC601 always operates in one of three distinct states — reset state, checkstop state, 
and normal execution state, which includes both user- and supervisor-level operations. The 
three states are described as follows: 

• Reset state — In the reset state all processor instruction execution is aborted, registers 
are initialized appropriately, and external signals are placed in the high-impedance 
state. 

• Normal instruction execution state — When the MPC601 is in the normal execution 
state, it operates at one of two privilege levels — user mode, and supervisor mode, 
which can be accessed when an exception is taken. This access privilege determines 
which instructions and which of the registers software can access. 

• Checkstop state — When a processor is in the checkstop state, instruction processing 
is suspended and generally cannot be restarted without resetting the processor. The 
checkstop state is provided to help diagnose problems. 

The PowerPC architecture defines register-to-register operations for all computational 
instructions. Source operands for these instructions are accessed from the on-chip registers 
or are provided as immediate values embedded in the opcode. The three-register instruction 
format allows specification of a different target register from the two source registers, thus 
preserving the original data for use by other instructions and reducing the number of 
instructions required for certain operations. Data is transferred between memory and 
registers with explicit load and store instructions only. 

2.1 Normal Instruction Execution State 

During normal execution, a program can access the registers, shown in Figure 2-1, 
depending on the program's access privilege (supervisor or user, determined by the 
privilege-level (PR) bit in the machine state register (MSR)). Note that registers such as the 
general-purpose registers (GPRs) and floating-point registers (FPRs) are accessed through 
operands that are part of the instructions. Access to registers can be explicit (that is, through 
the use of specific instructions for that purpose such as Move to Special-Purpose Register 
(mtspr) and Move from Special-Purpose Register (mfspr) instructions) or implicit as 
part of the execution of an instruction. Some registers are accessed both explicitly and 
implicitly. 

The numbers to the left of the SPRs indicate the number that is used in the syntax of the 
instruction operands to access the register. 

[Figure 2-1: register diagram; recovered label: SRR1, Save and Restore Register 1] 

1 MPC601-only registers. These registers may not be supported by other PowerPC processors. 

2 These registers are implemented differently on other PowerPC processors. 

3 The RTCU and RTCL registers can only be written in supervisor mode. 

4 The DEC can be read by user programs by specifying SPR6 in the mfspr instruction (for POWER compatibility). 

Figure 2-1. Programming Model—Registers 

The following paragraphs discuss the MPC601's user- and supervisor-level registers. 

• User-level registers — The user-level registers can be accessed by all software with 
either user or supervisor privileges. These include the following: 

— General-purpose registers (GPRs). The general-purpose register file consists of 
thirty-two, 32-bit GPRs designated as GPR0-GPR31. This register file serves as 
the data source or destination for all integer instructions and provides addresses 
for all memory-access instructions. See Section 2.2.1, "General Purpose 
Registers (GPRs)," for more information. 

— Floating-point registers (FPRs). The floating-point register file consists of 
thirty-two, 64-bit FPRs designated as FPR0-FPR31, which serve as the data source or 
destination for all floating-point instructions. These registers can contain data 
objects of either single- or double-precision floating-point formats. The 
floating-point register file is part of the FPU. For more information, see Section 2.2.2, 
"Floating-Point Registers (FPRs)." 

— Floating-point status and control register (FPSCR). The FPSCR is a user-control 
register in the FPU. It contains all floating-point exception signal bits, exception 
summary bits, exception enable bits, and rounding control bits needed for 
compliance with the IEEE 754 standard. For more information, see 
Section 2.2.3, "Floating-Point Status and Control Register (FPSCR)." 

— Condition register (CR). The condition register is a 32-bit register, divided into 
eight 4-bit fields, CR0-CR7, that reflects the result of certain arithmetic 
operations and provides a mechanism for testing and branching. For more 
information, see Section 2.2.4, "Condition Register (CR)." 

The remaining user-level registers are SPRs. Note however that while the PowerPC 
architecture provides a separate mechanism for accessing SPRs, this mechanism is 
not the usual method for accessing user-level SPRs. 

— MQ register (MQ). The MQ register is an MPC601-specific, 32-bit register used 
as a register extension to accommodate the product for the multiply instructions 
and the dividend for the divide instructions. It is also used as an operand of long 
rotate and shift instructions. This register is provided for compatibility with 
POWER architecture, and is not part of the PowerPC architecture. For more 
information, see Section 2.2.5.1, "MQ Register (MQ)." The MQ register is 
typically accessed implicitly as part of executing a computational instruction. 

— Integer exception register (XER). The XER is a 32-bit register that indicates such 
things as overflow and carries for integer operations. For more information, see 
Section 2.2.5.2, "Integer Exception Register (XER)." 

— Real-time clock (RTC) registers— RTCU and RTCL (RTC upper and RTC 
lower). The RTCU register maintains the number of seconds from a time 
specified by software. The RTCL register maintains a fraction of the current 
second in nanoseconds. The contents of either register can be copied to any GPR. 
These registers are specific to the MPC601. They are not supported in
the PowerPC architecture, which uses the time base facility rather than a separate
real-time clock. For more information, see Section 2.2.5.3, "Real-Time Clock
(RTC) Registers."

— Link register (LR). The 32-bit link register provides the branch target address for 
the Branch Conditional to Link Register (bclrx) instruction, and can optionally
be used to hold the logical address of the instruction that follows a branch and 
link instruction. Although this is an SPR, it is not typically accessed through the 
PowerPC's SPR mechanism. For more information, see Section 2.2.5.4, "Link 
Register (LR)." 

— Count register (CTR). The count register is a 32-bit register for holding a loop 
count that can be decremented during execution of appropriately coded branch 
instructions. The CTR can also provide the branch target address for the Branch 
Conditional to Count Register (bcctrx) instruction. Although this is an SPR, it is
not typically accessed through the PowerPC's SPR mechanism. For more 
information, see Section 2.2.5.5, "Count Register (CTR)." 

Supervisor-level registers — The MPC601 incorporates registers that can be
accessed only by programs executed with supervisor privileges. These registers 
consist of the machine state register, segment registers, and supervisor SPRs, 
described as follows: 

— The machine state register (MSR), shown in Figure 2-11, is a 32-bit register that
defines the state of the processor. The MSR can be modified by the Move to
Machine State Register (mtmsr), System Call (sc), and Return from Exception
(rfi) instructions. It can be read by the Move from Machine State Register 
(mfmsr) instruction. Note that in other PowerPC implementations, the MSR is a 
64-bit register. 

— Segment registers. The sixteen 32-bit segment registers are present only in 32-bit 
PowerPC implementations. Figure 2-12 and Figure 2-13 show the format of a 
segment register. The fields in the segment register are interpreted differently 
depending on the value of bit 0. 

The remaining supervisor-level registers are SPRs: 

— The 32-bit DAE/source instruction service register (DSISR) defines the cause of 
data access and alignment exceptions; see Figure 2-14. For more information, 
see Section 2.3.3.2, "DAE/Source Instruction Service Register (DSISR)." 

— The data address register (DAR) is a 32-bit register shown in Figure 2-15. After 
a data access or an alignment exception, DAR is set to the effective address of a 
load or store element. For more information, see Section 2.3.3.3, "Data Address 
Register (DAR)." 

— The decrementer register (DEC) is a 32-bit decrementing counter that provides 
a mechanism for causing a decrementer exception after a programmable delay. 
In the MPC601 , the RTC provides the frequency for the DEC. In other PowerPC 
implementations, the frequency is a subdivision of the processor clock. For more 
information, see Section 2.3.3.4, "Decrementer (DEC) Register." 
The 32-bit table search descriptor register 1 (SDR1) specifies the page table
variables used in virtual-to-physical address translation. For more information,
see Section 2.3.3.5, "Table Search Descriptor Register 1 (SDR1)."

The machine status save/restore register 0 (SRR0) is a 32-bit register that is used
by the MPC601 for saving machine status on exceptions and restoring machine
status when an rfi instruction is executed. SRR0 is shown in Figure 2-18. For
more information, see Section 2.3.3.6, "Machine Status Save/Restore Register 0
(SRR0)."

The machine status save/restore register 1 (SRR1) is a 32-bit register used to
save machine status on exceptions and to restore machine status when an rfi
instruction is executed. SRR1 is shown in Figure 2-19. For more information,
see Section 2.3.3.7, "Machine Status Save/Restore Register 1 (SRR1)."

The general SPRs, SPRG0-SPRG3, are 32-bit registers provided for operating
system use. See Figure 2-20. For more information, see Section 2.3.3.8, 
"General SPRs (SPRG0-SPRG3)." 

The external access register (EAR) is a 32-bit register that controls access to the 
external control facility through the eciwx and ecowx instructions. Note that the 
EAR register and the eciwx and ecowx instructions are an optional part of the 
PowerPC architecture and may not be supported in other PowerPC processors. 
For more information about the external control facility, see Section 2.3.3.9, 
"External Access Register (EAR)." 

The processor version register (PVR) is a 32-bit, read-only register that identifies 
the version (model) and revision level of the PowerPC processor. The contents 
of the PVR can be copied to a GPR by the Move from Special Purpose Register 
(mfspr) instruction. For more information, see Section 2.3.3.10, "Processor
Version Register (PVR)." 

Block-address translation (BAT) registers. The MPC601 includes eight
block-address translation registers (BATs), consisting of four pairs of BATs
(BAT0U-BAT3U and BAT0L-BAT3L). See Figure 2-1 for a list of the SPR numbers for
the BAT registers. Figure 2-23 and Figure 2-24 show the formats of the upper
and lower BAT registers. Note that other PowerPC implementations have two
sets of four BAT pairs — four sets of upper and lower IBATs (which occupy the
space of the unified BATs in the MPC601) and four sets of upper and lower
DBATs (located in the subsequent eight positions at SPR numbers 536-543).

The hardware implementation registers, HID0-HID2, HID5, and HID15, are
provided primarily for debugging. For more information, see Section 2.3.3.12.1,
"Checkstop Sources and Enables Register — HID0" through Section 2.3.3.12.5,
"Processor Identification Register (PIR) — HID15." HID15 holds the four-bit
processor identification tag (PID) that is useful for differentiating processors in
multiprocessor system designs.
Note that some registers common to other PowerPC processors are not implemented in the
MPC601. When the MPC601 detects SPR encodings other than those defined in this
document, it either takes a program exception (if bit 0 of the SPR encoding is set) or it treats
the instruction as a no-op (if bit 0 of the SPR encoding is clear).

2.1.1 Changing Privilege Levels 

During normal instruction execution, the processor operates using either user- or 
supervisor-level instructions and registers. Supervisor-level access is provided through the 
MPC601's exception mechanism. That is, when an exception is taken, either due to an error
or problem that needs to be serviced or deliberately through the use of a trap instruction, 
the processor begins operating in supervisor mode. The level of access is indicated by the 
privilege-level (PR) bit in the MSR. 

2.2 User-Level Registers 

This section describes in detail the registers that can be accessed by user-level software. All 
user-level registers can be accessed by supervisor-level software. 

2.2.1 General Purpose Registers (GPRs) 

Integer data is manipulated in the IU's thirty-two 32-bit GPRs shown in Figure 2-2. These
registers are accessed as source and destination registers through operands in the 
instruction syntax. 



[Figure: the register file GPR0, GPR1, ..., GPR31; each register spans bits 0-31.]

Figure 2-2. General Purpose Registers (GPRs)

All GPRs are cleared by hard reset. 

2.2.2 Floating-Point Registers (FPRs) 

The PowerPC architecture provides thirty-two, 64-bit FPRs as shown in Figure 2-3. These 
registers are accessed as source and destination registers through operands in floating-point 
instructions. Each FPR supports the double-precision, floating-point format. Every 
instruction that interprets the contents of an FPR as a floating-point value uses the double- 
precision floating-point format for this interpretation. 
All floating-point arithmetic instructions operate on data located in FPRs and, with the
exception of the compare instructions, place the result into an FPR. Information about the 
status of floating-point operations is placed into the floating-point status and control 
register (FPSCR) and in some cases, into CR after the completion of the operation's final 
writeback stage. For information on how CR is affected for floating-point operations, see 
Section 2.2.4, "Condition Register (CR)." 

Load and store double instructions are provided that transfer 64 bits of data between 
memory and the FPRs in the floating-point processor with no conversion. Load single 
instructions are provided to transfer and convert floating-point values in single-precision
floating-point format from memory to the same value in double-precision floating-point format in the
FPRs. Store single instructions are provided to transfer and convert floating-point values in 
double-precision floating-point format from the FPRs to the same value in single-precision 
floating-point format in memory. 

Single- and double-precision arithmetic instructions accept values from the FPRs in 
double-precision format. For single-precision arithmetic instructions, all input values must 
be representable in single-precision format; otherwise, the result placed into the target FPR 
and the setting of status bits in the FPSCR and in the condition register are undefined. 
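
The requirement that inputs to single-precision arithmetic be representable in single-precision format can be checked in software. The following is a minimal sketch in Python, not part of the MPC601 itself; the function name is hypothetical, and treating every NaN as representable is an assumption made for simplicity.

```python
import struct


def representable_in_single(value: float) -> bool:
    """Return True if a double-precision value survives a round trip
    through the 32-bit single-precision format unchanged."""
    # NaN compares unequal to itself; this sketch assumes any NaN is acceptable.
    if value != value:
        return True
    try:
        narrowed = struct.unpack(">f", struct.pack(">f", value))[0]
    except OverflowError:
        # Finite magnitude too large for the single-precision exponent range.
        return False
    return narrowed == value
```

For example, 0.5 passes the check, while 0.1 does not, because its nearest single-precision value differs from its double-precision value.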

The MPC601's floating-point arithmetic instructions produce intermediate results that may
be regarded as infinitely precise. After normalization or denormalization, if the precision of 
the intermediate result cannot be represented in the destination format (either 32-bit or 64- 
bit) then it must be rounded. The final result is then placed into the FPR in the double- 
precision format. 



[Figure: the register file FPR0, FPR1, ..., FPR31; each register spans bits 0-63.]

Figure 2-3. Floating-Point Registers (FPRs)

All FPRs are cleared by hard reset. 

2.2.3 Floating-Point Status and Control Register (FPSCR) 

The FPSCR, shown in Figure 2-4, controls the handling of floating-point exceptions and 
records status resulting from the floating-point operations. Bits 0-23 are status bits. Bits
24-31 are control bits. Bits in the FPSCR are updated after an operation's final writeback 
stage. 
The floating-point exception condition bits in the FPSCR are bits 0-12 and 21-23 and are
sticky, except for the floating-point enabled exception summary (FEX) and floating-point
invalid operation exception summary (VX). That is, once set, sticky bits remain set until
they are cleared by an mcrfs, mtfsfi, mtfsf, or mtfsb0 instruction.

FEX and VX are the logical ORs of other FPSCR bits. Therefore these two bits are not 
listed among the FPSCR bits directly affected by the various instructions. 



[Figure: FPSCR bit layout (bits 0-31), with reserved bits marked. Table 2-1 lists the fields.]

Figure 2-4. Floating-Point Status and Control Register (FPSCR)

A listing of FPSCR bit settings is shown in Table 2-1.

Table 2-1. FPSCR Bit Settings

Bit(s)  Name    Description
------  ------  -----------
0       FX      Floating-point exception summary (FX). Every floating-point instruction
                implicitly sets FPSCR[FX] if that instruction causes any of the
                floating-point exception bits in the FPSCR to transition from 0 to 1. The
                mcrfs instruction implicitly clears FPSCR[FX] if the FPSCR field containing
                FPSCR[FX] is copied. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions can
                set or clear FPSCR[FX] explicitly. This is a sticky bit.

1       FEX     Floating-point enabled exception summary (FEX). This bit signals the
                occurrence of any of the enabled exception conditions. It is the logical OR
                of all the floating-point exception bits masked with their respective
                enable bits. The mcrfs instruction implicitly clears FPSCR[FEX] if the
                result of the logical OR described above becomes zero. The mtfsf, mtfsfi,
                mtfsb0, and mtfsb1 instructions cannot set or clear FPSCR[FEX] explicitly.
                This is not a sticky bit.

2       VX      Floating-point invalid operation exception summary (VX). This bit signals
                the occurrence of any invalid operation exception. It is the logical OR of
                all of the invalid operation exceptions. The mcrfs instruction implicitly
                clears FPSCR[VX] if the result of the logical OR described above becomes
                zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot set or
                clear FPSCR[VX] explicitly. This is not a sticky bit.

3       OX      Floating-point overflow exception (OX). This is a sticky bit. See
                Section 5.4.7.4, "Overflow Exception Condition."

4       UX      Floating-point underflow exception (UX). This is a sticky bit. See
                Section 5.4.7.5, "Underflow Exception Condition."

5       ZX      Floating-point zero divide exception (ZX). This is a sticky bit. See
                Section 5.4.7.3, "Zero Divide Exception Condition."

6       XX      Floating-point inexact exception (XX). This is a sticky bit. See
                Section 5.4.7.6, "Inexact Exception Condition."

7       VXSNAN  Floating-point invalid operation exception for SNaN (VXSNAN). This is a
                sticky bit. See Section 5.4.7.2, "Invalid Operation Exception Conditions."

8       VXISI   Floating-point invalid operation exception for ∞ - ∞ (VXISI). This is a
                sticky bit. See Section 5.4.7.2, "Invalid Operation Exception Conditions."

9       VXIDI   Floating-point invalid operation exception for ∞/∞ (VXIDI). This is a
                sticky bit. See Section 5.4.7.2, "Invalid Operation Exception Conditions."

10      VXZDZ   Floating-point invalid operation exception for 0/0 (VXZDZ). This is a
                sticky bit. See Section 5.4.7.2, "Invalid Operation Exception Conditions."

11      VXIMZ   Floating-point invalid operation exception for ∞ * 0 (VXIMZ). This is a
                sticky bit. See Section 5.4.7.2, "Invalid Operation Exception Conditions."

12      VXVC    Floating-point invalid operation exception for invalid compare (VXVC).
                This is a sticky bit. See Section 5.4.7.2, "Invalid Operation Exception
                Conditions."

13      FR      Floating-point fraction rounded (FR). The last floating-point instruction
                that potentially rounded the intermediate result incremented the fraction.
                (See Section 2.4.9.6, "Rounding.") This bit is not sticky.

14      FI      Floating-point fraction inexact (FI). The last floating-point instruction
                that potentially rounded the intermediate result produced an inexact
                fraction or a disabled exponent overflow. (See Section 2.4.9.6,
                "Rounding.") This bit is not sticky.

15-19   FPRF    Floating-point result flags (FPRF). This field is based on the value placed
                into the target register even if that value is undefined. Refer to
                Table 2-2 for specific bit settings.
                15     Floating-point result class descriptor (C). Floating-point
                       instructions other than the compare instructions may set this bit
                       with the FPCC bits, to indicate the class of the result.
                16-19  Floating-point condition code (FPCC). Floating-point compare
                       instructions always set one of the FPCC bits to one and the other
                       three FPCC bits to zero. Other floating-point instructions may set
                       the FPCC bits with the C bit, to indicate the class of the result.
                       Note that in this case the high-order three bits of the FPCC retain
                       their relational significance indicating that the value is less
                       than, greater than, or equal to zero.
                       16  Floating-point less than or negative (FL or <)
                       17  Floating-point greater than or positive (FG or >)
                       18  Floating-point equal or zero (FE or =)
                       19  Floating-point unordered or NaN (FU or ?)

20      —       Reserved

21      VXSOFT  Not implemented in the MPC601. Some implementations use this as the
                floating-point invalid operation exception for software request (VXSOFT).
                This bit can be altered only by the mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1
                instructions. The purpose of VXSOFT is to allow software to cause an
                invalid operation condition for a condition that is not necessarily
                associated with the execution of a floating-point instruction. For example,
                it might be set by a program that computes a square root if the source
                operand is negative. This is a sticky bit. See Section 5.4.7.2, "Invalid
                Operation Exception Conditions."

22      VXSQRT  Not implemented in the MPC601. Some implementations use this as the
                floating-point invalid operation exception for invalid square root
                (VXSQRT). This is a sticky bit. Defining this bit guarantees that software
                can simulate fsqrt and frsqrte, and provides a consistent interface to
                handle exceptions caused by square-root operations. See Section 5.4.7.2,
                "Invalid Operation Exception Conditions."

23      VXCVI   Floating-point invalid operation exception for invalid integer convert
                (VXCVI). This is a sticky bit. See Section 5.4.7.2, "Invalid Operation
                Exception Conditions."

24      VE      Floating-point invalid operation exception enable (VE). See
                Section 5.4.7.2, "Invalid Operation Exception Conditions."

25      OE      Floating-point overflow exception enable (OE). See Section 5.4.7.4,
                "Overflow Exception Condition."

26      UE      Floating-point underflow exception enable (UE). This bit should not be used
                to determine whether denormalization should be performed on floating-point
                stores. See Section 5.4.7.5, "Underflow Exception Condition."

27      ZE      Floating-point zero divide exception enable (ZE). See Section 5.4.7.3,
                "Zero Divide Exception Condition."

28      XE      Floating-point inexact exception enable (XE). See Section 5.4.7.6,
                "Inexact Exception Condition."

29      —       Reserved. This bit may be implemented as the non-IEEE mode bit (NI) in
                other PowerPC implementations.

30-31   RN      Floating-point rounding control (RN). See Section 2.4.9.6, "Rounding."
                00  Round to nearest
                01  Round toward zero
                10  Round toward +infinity
                11  Round toward -infinity
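
Because FEX and VX are derived from other FPSCR bits, their values can be recomputed from a 32-bit FPSCR image. The following Python sketch illustrates the two logical ORs described in Table 2-1. It assumes the bit numbering used in this manual (bit 0 is the most significant bit); the helper names are hypothetical, and bits 21 and 22 are omitted because the table lists them as not implemented in the MPC601.

```python
def mask(n: int) -> int:
    """Mask for FPSCR bit n, where bit 0 is the most significant bit."""
    return 1 << (31 - n)


def bit(fpscr: int, n: int) -> int:
    """Value (0 or 1) of FPSCR bit n."""
    return (fpscr >> (31 - n)) & 1


# VXSNAN, VXISI, VXIDI, VXZDZ, VXIMZ, VXVC, VXCVI
INVALID_OP_BITS = (7, 8, 9, 10, 11, 12, 23)
# Exception bit -> enable bit: OX/OE, UX/UE, ZX/ZE, XX/XE
ENABLE_FOR = {3: 25, 4: 26, 5: 27, 6: 28}


def derive_vx(fpscr: int) -> int:
    """VX is the logical OR of the invalid operation exception bits."""
    return int(any(bit(fpscr, n) for n in INVALID_OP_BITS))


def derive_fex(fpscr: int) -> int:
    """FEX is the OR of each exception bit ANDed with its enable bit."""
    fex = derive_vx(fpscr) & bit(fpscr, 24)  # VX masked with VE
    for exc, enable in ENABLE_FOR.items():
        fex |= bit(fpscr, exc) & bit(fpscr, enable)
    return fex
```

For example, an image with only ZX set yields FEX = 0, while an image with both ZX and ZE set yields FEX = 1.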



Table 2-2 illustrates the floating-point result flags used by the MPC601. The result flags 
correspond to FPSCR bits 15-19. 

Table 2-2. Floating-Point Result Flags in FPSCR

Result Flags
(Bits 15-19)
C < > = ?       Result Value Class
------------    ----------------------
1 0 0 0 1       Quiet NaN
0 1 0 0 1       -Infinity
0 1 0 0 0       -Normalized number
1 1 0 0 0       -Denormalized number
1 0 0 1 0       -Zero
0 0 0 1 0       +Zero
1 0 1 0 0       +Denormalized number
0 0 1 0 0       +Normalized number
0 0 1 0 1       +Infinity



The FPSCR is cleared by hard reset. 
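
The result flags in Table 2-2 can be decoded from an FPSCR image with a simple lookup. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes bit 0 is the most significant bit, so bits 15-19 sit 12 positions above the least significant bit.

```python
# FPRF encodings (C, <, >, =, ?) taken from Table 2-2.
RESULT_CLASS = {
    0b10001: "Quiet NaN",
    0b01001: "-Infinity",
    0b01000: "-Normalized number",
    0b11000: "-Denormalized number",
    0b10010: "-Zero",
    0b00010: "+Zero",
    0b10100: "+Denormalized number",
    0b00100: "+Normalized number",
    0b00101: "+Infinity",
}


def fprf(fpscr: int) -> int:
    """Extract FPSCR bits 15-19 as a 5-bit value (bit 15 is the high bit)."""
    return (fpscr >> 12) & 0x1F


def result_class(fpscr: int) -> str:
    """Name the result class, or report an encoding that Table 2-2 omits."""
    return RESULT_CLASS.get(fprf(fpscr), "not listed in Table 2-2")
```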
2.2.4 Condition Register (CR)

The condition register (CR) is a 32-bit register that reflects the result of certain operations 
and provides a mechanism for testing and branching. The bits in the CR are grouped into 
eight 4-bit fields, CR0-CR7, as shown in Figure 2-5.



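[Figure: the CR is divided into eight 4-bit fields: CR0 (bits 0-3), CR1 (bits 4-7),
CR2 (bits 8-11), CR3 (bits 12-15), CR4 (bits 16-19), CR5 (bits 20-23), CR6 (bits 24-27),
and CR7 (bits 28-31).]

Figure 2-5. Condition Register (CR)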

The CR fields can be set in one of the following ways:

• Specified fields of the CR can be set by a move instruction to the CR from a GPR
(mtcrf) or from the FPSCR (mcrfs).

• Specified fields of the CR can be moved from one CRx field to another with the
mcrf instruction.

• A specified field of the CR can be set by a move instruction (mcrxr) to the CR from
the XER.

• Condition register logical instructions can be used to perform logical operations on
specified bits in the condition register.

• CR0 can be the implicit result of an integer operation.

• CR1 can be the implicit result of a floating-point operation.

• A specified CR field can be the explicit result of either an integer or floating-point
compare instruction.

Instructions are provided to test individual CR bits. The condition register is cleared by 
hard reset. 
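
The field layout of the CR can be illustrated with a short Python sketch that extracts a 4-bit field or a single bit from a 32-bit CR image. This is only an illustration with hypothetical helper names; it assumes the numbering used in this manual, in which bit 0 is the most significant bit and CR0 occupies bits 0-3.

```python
def cr_field(cr: int, n: int) -> int:
    """Return the 4-bit field CRn (n = 0..7) from a 32-bit CR image."""
    if not 0 <= n <= 7:
        raise ValueError("CR field number must be 0..7")
    # CR0 is the most significant nibble, CR7 the least significant.
    return (cr >> (28 - 4 * n)) & 0xF


def cr_bit(cr: int, bit: int) -> int:
    """Return CR bit 0..31, where bit 0 is the most significant bit."""
    if not 0 <= bit <= 31:
        raise ValueError("CR bit number must be 0..31")
    return (cr >> (31 - bit)) & 1
```

With these helpers, testing bit 2 of the CR is equivalent to testing the third bit of the CR0 field.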

2.2.4.1 Condition Register CR0 Field Definition

In most integer instructions, when the record bit, Rc, is set, CR0 is generated by an
algebraic comparison of the result to zero. The integer arithmetic and logical instructions
(addic., andi., and andis.) generate these four bits in CR0 implicitly. These bits are shown
in Table 2-3. In the descriptions below, the result refers to the 32-bit value placed into the
target register. If any portion of the result is undefined, the value placed in the CR0 field is
undefined.
Table 2-3. Bit Settings for CR0 Field of CR

CR0
Bit   Description
----  -----------
0     Negative (LT) — This bit is set when the result is negative.

1     Positive (GT) — This bit is set when the result is positive (and not zero).

2     Zero (EQ) — This bit is set when the result is zero.

3     Summary overflow (SO) — This is a copy of the final state of XER[SO] at the
      completion of the instruction.
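
The CR0 settings in Table 2-3 amount to a signed comparison of the 32-bit result with zero, plus a copy of XER[SO]. The following Python sketch models that rule; the function name is hypothetical, and the returned nibble places LT in the high-order position to match CR0 bits 0-3.

```python
def cr0_from_result(result: int, xer_so: int) -> int:
    """Return the 4-bit CR0 value (LT, GT, EQ, SO) for a 32-bit result."""
    result &= 0xFFFFFFFF
    # Interpret the 32-bit pattern as a two's-complement signed value.
    signed = result - (1 << 32) if result & 0x80000000 else result
    lt = int(signed < 0)
    gt = int(signed > 0)
    eq = int(signed == 0)
    return (lt << 3) | (gt << 2) | (eq << 1) | (xer_so & 1)
```

Exactly one of LT, GT, and EQ is set for any defined result, while SO is independent of the result value.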



2.2.4.2 Condition Register CR1 Field Definition 

In all floating-point instructions except mcrfs, fcmpu, and fcmpo, when the record option is
specified, CR1 is copied from bits 0-3 of the floating-point status and control register
(FPSCR). For more information about the FPSCR, see Section 2.2.3, "Floating-Point
Status and Control Register (FPSCR)." The bit settings for the CR1 field are shown in
Table 2-4.

Table 2-4. Bit Settings for CR1 Field of CR

CR1
Bit   Description
----  -----------
4     Floating-point exception (FX) — This is a copy of the final state of FPSCR[FX] at
      the completion of the instruction.

5     Floating-point enabled exception (FEX) — This is a copy of the final state of
      FPSCR[FEX] at the completion of the instruction.

6     Floating-point invalid exception (VX) — This is a copy of the final state of
      FPSCR[VX] at the completion of the instruction.

7     Floating-point overflow exception (OX) — This is a copy of the final state of
      FPSCR[OX] at the completion of the instruction.
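
The copy from FPSCR bits 0-3 into CR1 (CR bits 4-7) can be modeled as a shift and mask. This Python sketch is illustrative only; the function name is hypothetical, and it assumes bit 0 is the most significant bit of both registers.

```python
def copy_fpscr_to_cr1(cr: int, fpscr: int) -> int:
    """Return a CR image whose CR1 field holds FPSCR bits 0-3 (FX, FEX, VX, OX)."""
    top_nibble = (fpscr >> 28) & 0xF      # FPSCR bits 0-3
    cleared = cr & 0xF0FFFFFF             # clear CR bits 4-7 (the CR1 field)
    return cleared | (top_nibble << 24)   # place the nibble into CR1
```

All other CR fields are left unchanged by the copy.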



2.2.4.3 Condition Register CRn Field — Compare Instruction

When a specified CR field is set by a compare instruction, the bits of the specified field are 
interpreted, as shown in Table 2-5. A condition register field can also be accessed by the 
mfcr, mcrf, and mtcrf instructions. 
Table 2-5. CRn Field Bit Settings for Compare Instructions

CRn
Bit*  Description
----  -----------
0     Less than, floating-point less than (LT, FL).
      For integer compare instructions, (rA) < SIMM, UIMM, or (rB) (algebraic comparison)
      or (rA) < SIMM, UIMM, or (rB) with the operands treated as unsigned values (logical
      comparison).
      For floating-point compare instructions, (frA) < (frB).

1     Greater than, floating-point greater than (GT, FG).
      For integer compare instructions, (rA) > SIMM, UIMM, or (rB) (algebraic comparison)
      or (rA) > SIMM, UIMM, or (rB) with the operands treated as unsigned values (logical
      comparison).
      For floating-point compare instructions, (frA) > (frB).

2     Equal, floating-point equal (EQ, FE).
      For integer compare instructions, (rA) = SIMM, UIMM, or (rB).
      For floating-point compare instructions, (frA) = (frB).

3     Summary overflow, floating-point unordered (SO, FU).
      For integer compare instructions, this is a copy of the final state of XER[SO] at
      the completion of the instruction.
      For floating-point compare instructions, one or both of (frA) and (frB) is not a
      number (NaN).



*Here, the bit indicates the bit number in any one of the four-bit subfields, CR0-CR7. 
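
The difference between the algebraic and logical integer comparisons in Table 2-5 is whether the 32-bit operands are interpreted as signed or unsigned values. The following Python sketch models the resulting 4-bit field; the function names and the `logical` parameter are hypothetical and do not correspond to instruction operands.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def compare(a: int, b: int, xer_so: int, logical: bool = False) -> int:
    """Return the 4-bit CRn value (LT, GT, EQ, SO) for an integer compare."""
    a &= MASK32
    b &= MASK32
    if not logical:
        # Algebraic comparison: operands are signed.
        a, b = to_signed(a), to_signed(b)
    lt, gt, eq = int(a < b), int(a > b), int(a == b)
    return (lt << 3) | (gt << 2) | (eq << 1) | (xer_so & 1)
```

The pattern 0xFFFFFFFF compares as less than 1 algebraically (it represents -1) but as greater than 1 logically.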

2.2.5 User-Level SPRs 

User-level SPRs can be accessed by either user- or supervisor-level instructions. Typically, 
these registers are accessed implicitly through the encoding of the instruction rather than
explicitly through the Move to Special Purpose Register (mtspr) and Move from Special 
Purpose Register (mfspr) instructions. Some SPRs are implementation-specific; some 
SPRs in the MPC601 may not be implemented in other PowerPC processors, or may not be 
implemented in the same way in other PowerPC processors. 

For registers with reserved bits, implementations return zeros or return the value last 
written to those bits. Table 2-6 summarizes how the MPC601 treats the undefined bits in 
the user-level SPRs. 

Table 2-6. Undefined Bits in User-Level SPRs

Register   Value Returned for Undefined Bits
--------   ---------------------------------
XER        Zero



Note that some SPR bits are reserved — the results of writing to and reading from these bits 
are undefined. Some of these bits are used by other PowerPC implementations. 

The RTCL register is defined as 32 bits, but the lowest-order seven bits are not 
implemented. Those bits are reserved, and zeroes are loaded into the respective bit 
positions of the target register when the RTCL is read. 
When the MPC601 detects SPR encodings other than those defined in this document, it
either takes a program exception (if bit 0 of the SPR encoding is set) or it treats the
instruction as a no-op (if bit 0 of the SPR encoding is clear).

2.2.5.1 MQ Register (MQ) 

The MQ register (MQ), shown in Figure 2-6, is a 32-bit register used as a register extension 
to accommodate the product for the multiply instruction and the dividend for the divide 
instruction. It is also used as an operand of long rotate and shift instructions. The MQ 
register is implemented on the MPC601. 



[Figure: the MQ register, bits 0-31.]

Figure 2-6. MQ Register (MQ)



The MQ register is not defined in the PowerPC architecture. However, in the MPC601, it
may be modified during the execution of the mulli, mullw, mulhw, mulhwu, divw, and
divwu instructions, which are PowerPC instructions.

The value written to the MQ register during these operations is operand-dependent and 
therefore, the MQ contents become undefined after any of these instructions executes. In 
addition, the MQ is modified by the implementation-specific instructions supported by the 
MPC601 that are not part of the PowerPC architecture. These are listed in Table 2-7. 

Table 2-7. MPC601-Specific Instructions that Modify the MQ Register

Mnemonic   Instruction Name                           Read/Write
--------   ----------------------------------------   ----------
mul        Multiply                                   Read/write
div        Divide                                     Read/write
divs       Divide Short                               Read/write
sliq       Shift Left Immediate with MQ               Read/write
slliq      Shift Left Long Immediate with MQ          Read/write
sle        Shift Left Extended                        Write
sleq       Shift Left Extended with MQ                Read/write
sllq       Shift Left Long with MQ                    Read/write
slq        Shift Left with MQ                         Write
sraiq      Shift Right Algebraic Immediate with MQ    Write
sraq       Shift Right Algebraic with MQ              Write
sre        Shift Right Extended                       Write
srea       Shift Right Extended Algebraic             Write
sreq       Shift Right Extended with MQ               Read/write
sriq       Shift Right Immediate with MQ              Write
srliq      Shift Right Long Immediate with MQ         Read/write
srlq       Shift Right Long with MQ                   Read/write
srq        Shift Right with MQ                        Write



The PowerPC instructions listed in Table 2-8 use the MQ register but leave it in an 
undefined state. 

Table 2-8. PowerPC Instructions that Use the MQ Register

Mnemonic   Instruction Name
--------   ----------------------------
mulli      Multiply Low Immediate
mullw      Multiply Low
mulhw      Multiply High Word
mulhwu     Multiply High Word Unsigned
divw       Divide Word
divwu      Divide Word Unsigned


The Move to Special Purpose Register (mtspr) and Move from Special Purpose Register
(mfspr) instructions can access the MQ register. The SPR number for the MQ register is 0.

The MQ register is not part of the PowerPC architecture and will not be supported in all 
PowerPC microprocessors. The MQ register is cleared by hard reset. 

2.2.5.2 Integer Exception Register (XER) 

The integer exception register (XER) is a user-level, 32-bit register shown in Figure 2-7. 

[Figure: XER bit layout: SO (bit 0), OV (bit 1), CA (bit 2), reserved (bits 3-15), byte
compare value (bits 16-23), reserved (bit 24), byte count (bits 25-31).]

Figure 2-7. Integer Exception Register (XER)

The SPR number for the XER is 1. The bit definitions for XER, shown in Table 2-9, are 
based on the operation of an instruction considered as a whole, not on intermediate results. 
For example, the result of the Subtract from Carrying (subfcx) instruction is specified as 
the sum of three values. This instruction sets bits in the XER based on the entire operation, 
not on an intermediate sum. 

The XER is cleared by hard reset. 
Table 2-9. Integer Exception Register Bit Definitions

Bit(s)  Name  Description
------  ----  -----------
0       SO    Summary Overflow (SO) — The summary overflow bit (SO) is set whenever an
              instruction sets the overflow bit (OV) to indicate overflow and remains set
              until software clears it. It is not altered by compare instructions or other
              instructions that cannot overflow.

1       OV    Overflow (OV) — The overflow bit is set to indicate that an overflow has
              occurred during execution of an instruction. Integer add and subtract
              instructions having OE=1 set OV if the carry out of bit 0 is not equal to the
              carry out of bit 1, and clear it otherwise. The OV bit is not altered by
              compare instructions or other instructions that cannot overflow.

2       CA    Carry (CA) — In general, the carry bit is set to indicate that a carry out of
              bit 0 occurred during execution of an instruction. Add carrying, subtract
              from carrying, add extended, and subtract from extended instructions set CA
              to one if there is a carry out of bit 0, and clear it otherwise. The CA bit
              is not altered by compare instructions, or other instructions that cannot
              carry, except that shift right algebraic instructions set the CA bit to
              indicate whether any '1' bits have been shifted out of a negative quantity.

3-15    —     Reserved

16-23         This field contains the byte to be compared by a Load String and Compare Byte
              Indexed (lscbx) instruction.

24      —     Reserved

25-31         This field specifies the number of bytes to be transferred by a Load String
              Word Indexed (lswx), Store String Word Indexed (stswx) or Load String and
              Compare Byte Indexed (lscbx) instruction.
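
The carry and overflow rules in Table 2-9 can be checked with a small model of a 32-bit add. The following Python sketch is an illustration under stated assumptions, not a description of the hardware: the function names are hypothetical, bit 0 is the most significant bit, and the carry out of bit 1 is computed as the carry produced by adding the low-order 31 bits.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a: int, b: int, carry_in: int = 0):
    """Return (result, CA, OV) for a 32-bit add, following Table 2-9."""
    a &= MASK32
    b &= MASK32
    total = a + b + carry_in
    ca = (total >> 32) & 1                      # carry out of bit 0
    low = (a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in
    carry_out_bit1 = (low >> 31) & 1            # carry out of bit 1
    ov = ca ^ carry_out_bit1                    # overflow when the two differ
    return total & MASK32, ca, ov


def xer_fields(xer: int) -> dict:
    """Split a 32-bit XER image into the fields listed in Table 2-9."""
    return {
        "SO": (xer >> 31) & 1,            # bit 0
        "OV": (xer >> 30) & 1,            # bit 1
        "CA": (xer >> 29) & 1,            # bit 2
        "byte_compare": (xer >> 8) & 0xFF,  # bits 16-23
        "byte_count": xer & 0x7F,           # bits 25-31
    }
```

Adding 1 to 0x7FFFFFFF produces no carry out of bit 0 but does carry out of bit 1, so OV is set; adding 1 to 0xFFFFFFFF carries out of both, so CA is set and OV is clear.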



2.2.5.3 Real-Time Clock (RTC) Registers 

The real-time clock (RTC) registers provide a high-resolution measure of real time for 
indicating the date and time of day. The RTC facility provides a calendar range of roughly 
135 years. The RTC registers are not implemented on other PowerPC processors; instead, 
other PowerPC processors use a time base which is a subdivision of the processor clock. 

The RTC input is sampled using the CPU clock. Therefore, if the CPU clock is less than 
twice the RTC frequency, real-time clock (and decrementer) sampling and incrementing 
errors will occur. Therefore, in systems that change the CPU clock frequency dynamically 
beyond this limit, a method of saving and restoring the real-time clock register values via 
external means is required. 

The RTC registers, shown in Figure 2-8, consist of the following: 

• Real-time clock upper (RTCU) — This register specifies the number of seconds that 
have elapsed since the time specified in the software. 

• Real-time clock lower (RTCL) — This register contains the number of nanoseconds
since the beginning of the current second. 

Together, RTCU and RTCL provide a high-resolution measurement of real time. 

Reading any portion of the RTC registers does not affect their contents. Writing the
RTCU and RTCL registers is allowed for supervisor programs only (mtspr is
supervisor-only for the RTC registers).

[Figure: RTCU (bits 0-31); RTCL (bits 2-24 implemented; bits 0-1 and 25-31 reserved).]

Figure 2-8. Real-Time Clock (RTC) Registers 

The RTC runs constantly while power is applied. Note that the RTC will not be 
implemented in other PowerPC processors. The condition register is cleared by hard reset. 
Note however, that if an external clock is connected to the RTC, the RTCL and RTCU 
registers can change from their initial values without receiving instructions to load those 
registers. 

2.2.5.3.1 Real-Time Clock Lower (RTCL) Register 

The RTCL functions as a 23-bit counter that provides the lower word of the RTC. As an 
indicator of the granularity of the RTC, enough bits are implemented to provide a resolution 
that is finer than the time required to execute 10 Add Immediate (addi) instructions. The 
following details describe the RTCL: 

• Bits 0-1 and bits 25-31 are not implemented. (The number of lower order bits
required is determined by the frequency of the oscillator, 7.8125 MHz.)

• The least significant implemented bit of the RTCL (bit 24) is incremented every 128
ns.

• The period of the RTCL is one billion nanoseconds (one second). 

• Unless it is altered by software, the RTCL reaches its terminal count value of
999,999,872 (one billion minus 128) after 999,999,999 ns. The next time RTCL is
incremented, it cycles to all zeros and RTCU is incremented.

• Using the mfspr instruction with RTCL does not affect its contents. Unimplemented 
bits are read as zeros. 

• If the mtspr instruction is used to replace the contents of the RTCL with the contents 
of a GPR, the values of the GPR corresponding to the unimplemented bits in the 
RTCL are ignored. 
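
The counting behavior listed above can be sketched as a small software model. This is only an illustration, not a description of the hardware: the names `tick`, `write_rtcl`, and the constants are hypothetical, and the model does not cover what happens if software loads RTCL with a value of one billion or more.

```python
# Hypothetical model of the RTCL/RTCU counting rules described above.
RTCL_STEP_NS = 128                   # weight of RTCL bit 24, the least significant implemented bit
RTCL_PERIOD_NS = 1_000_000_000       # RTCL wraps after one second
RTCL_IMPLEMENTED_MASK = 0x3FFFFF80   # bits 2-24 (bit 0 is the most significant bit)


def write_rtcl(gpr_value):
    """Model mtspr to RTCL: GPR bits matching unimplemented RTCL bits are ignored."""
    return gpr_value & RTCL_IMPLEMENTED_MASK


def tick(rtcu, rtcl):
    """Advance the modeled RTC by one period of the 7.8125 MHz oscillator."""
    rtcl += RTCL_STEP_NS
    if rtcl >= RTCL_PERIOD_NS:
        rtcl = 0                          # RTCL cycles to all zeros
        rtcu = (rtcu + 1) & 0xFFFFFFFF    # RTCU wraps from all ones to all zeros
    return rtcu, rtcl
```

Under these assumptions, one second corresponds to 7,812,500 calls to `tick`, which matches the 7.8125 MHz oscillator frequency.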

2.2.5.3.2 Real-Time Clock Upper (RTCU) Register 

The RTCU register is a 32-bit binary counter in which the least-significant bit is
incremented in synchronization with the transition to zero of the RTCL counter (after one
billion nanoseconds, that is, every second). All 32 bits of the RTCU are implemented.
When the RTCU is set to all ones, the next time it is incremented it becomes all zeros.

When the contents of the RTCU or the RTCL are copied to a GPR, bits in the GPR
corresponding to the unimplemented bits in the RTCL are cleared.

2.2.5.3.3 Reading the RTC 

The contents of either RTC register can be copied into a GPR by user programs with the
mfspr instruction. Because the RTCL continues to increment and the RTCU may be
incremented while instructions are being executed that read the two RTC registers, when
the current time is required in a form that includes more than the upper or lower word of
the RTC, the following procedure should be used:

1. Execute the following instruction sequence:

mfspr rA,4      # rA <- RTCU (read through SPR4)
mfspr rB,5      # rB <- RTCL (read through SPR5)
mfspr rC,4      # rC <- RTCU, read a second time

2. If (rC) = (rA)

then the correct value has been obtained
else repeat step 1

Step 2 is required because the RTC continues to increment and the RTCU may increment
while the instructions that read the two halves of the RTC are being executed. If the values
in rC and rA match, the RTCU has not been incremented, and the RTCU value can be used
along with the value in rB as the current RTC value. However, if the values of rC and rA
differ, the RTCU has been incremented and it cannot be guaranteed which, if either, RTCU
value should be associated with the value in rB.
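
The retry procedure can be illustrated with a short software model. This is a sketch only: `read_rtc` and the `FakeRTC` stand-in are hypothetical names, and the two reader callables take the place of the mfspr instructions.

```python
def read_rtc(read_rtcu, read_rtcl):
    """Return a consistent (seconds, nanoseconds) pair from two reader callables."""
    while True:
        upper_before = read_rtcu()   # step 1: first read of RTCU (rA)
        lower = read_rtcl()          #         read of RTCL (rB)
        upper_after = read_rtcu()    #         second read of RTCU (rC)
        if upper_before == upper_after:   # step 2: RTCU did not change
            return upper_before, lower
        # RTCU was incremented during the sequence, so repeat step 1


class FakeRTC:
    """Hypothetical stand-in whose RTCU increments during the first attempt."""

    def __init__(self):
        self._upper = iter([7, 8, 8, 8])
        self._lower = iter([999_999_872, 128])

    def read_rtcu(self):
        return next(self._upper)

    def read_rtcl(self):
        return next(self._lower)
```

With the stand-in above, the first attempt sees RTCU change from 7 to 8 and is discarded; the second attempt returns the pair read entirely within second 8.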

Successive readings of the RTC registers do not necessarily give unique values. If unique 
values are required, and the RTCL being updated at least once every ten add immediate 
instruction times is insufficient to ensure unique values, a software solution is required. 

2.2.5.3.4 RTC Synchronization in a Multiprocessor System 

Typically, RTCs must be synchronized in a multiprocessor system. 

One way to achieve synchronization is to use a gated RTC clock as the input to all
MPC601s in a system. The gated clock can be enabled and disabled through the use of an
I/O access (either an I/O controller interface store instruction to a selected BUID, or a
memory-mapped I/O access). This allows the RTC input clock to all processors to be turned
on and off at the same time. Each processor's RTC registers can then be loaded with the same
value before starting the RTC input clock.

2.2.5.4 Link Register (LR) 

The 32-bit LR supplies the branch target address for the Branch Conditional to Link
Register (bclrx) instruction, and can be used to hold the logical address of the instruction
that follows a branch and link instruction. The format of LR is shown in Figure 2-9.

[Figure: Branch Address (bits 0-31).]

Figure 2-9. Link Register (LR) 

Note that although the two least-significant bits can accept any values written to them, they
are ignored when the LR is used as an address. The link register can be accessed by the
mtspr and mfspr instructions using SPR number 8 (the instruction encoding transposes the
two 5-bit halves of the 10-bit binary representation, giving b'01000 00000'). Prefetching
instructions along the target path (loaded by an mtspr instruction) is possible provided the
link register is loaded sufficiently ahead of the branch instruction. It is usually possible to
prefetch along a target path loaded by a branch and link instruction.
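
The SPR field encoding mentioned above can be checked with a short calculation. The sketch below assumes the PowerPC convention that the two 5-bit halves of the 10-bit SPR number are exchanged in the instruction encoding; the function name `spr_field` is hypothetical.

```python
def spr_field(spr_number):
    """Return the 10-bit SPR field as it appears in an mtspr/mfspr encoding."""
    if not 0 <= spr_number < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    high = (spr_number >> 5) & 0x1F   # upper 5 bits of the SPR number
    low = spr_number & 0x1F           # lower 5 bits of the SPR number
    return (low << 5) | high          # halves exchanged in the encoding


# SPR 8 (the link register) encodes as b'01000 00000'.
LR_FIELD = format(spr_field(8), "010b")
```

Applying `spr_field` twice returns the original SPR number, which gives a simple way to decode a field taken from an instruction word.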

Branching can be conditional or unconditional, and the return address can optionally be 
provided. If the return address is to be provided, the effective address of the instruction 
following the branch instruction is placed into the LR after the branch target address has 
been computed — this is done regardless of whether the branch is taken. 

As a performance optimization, and as an aid for handling the precise exception model, the
MPC601 implements a two-entry link register shadow. Shadowing allows the link register
to be updated by branch instructions that are executed out-of-order with respect to integer
instructions without destroying machine state information if any integer instruction takes
a precise exception. The link register is cleared by hard reset.

2.2.5.5 Count Register (CTR) 

The count register (CTR) is a 32-bit register for holding a loop count that can be
decremented during execution of branch instructions that contain an appropriately coded
BO field. If the value in CTR is 0 before being decremented, it is -1 afterward. The count
register provides the branch target address for the Branch Conditional to Count Register
(bcctrx) instruction. The CTR is shown in Figure 2-10.
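
The wrap from 0 to -1 follows from 32-bit two's-complement arithmetic, as the following sketch shows. The helper names are hypothetical and the code models only the register value, not the branch decision.

```python
def decrement_ctr(ctr):
    """Return the CTR value after one decrement, kept to 32 bits."""
    return (ctr - 1) & 0xFFFFFFFF


def as_signed32(value):
    """Interpret a 32-bit register value as a two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value
```

A CTR holding 0 therefore becomes x'FFFFFFFF' after the decrement, which reads as -1 when treated as a signed quantity.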



[Figure: CTR (bits 0-31).]

Figure 2-10. Count Register (CTR) 

Prefetching instructions along the target path is also possible provided the count register is 
loaded sufficiently ahead of the branch instruction. 

The count register can be accessed by the mtspr and mfspr instructions by specifying
SPR9. In branch conditional instructions, the BO field specifies the conditions under which
the branch is taken. The first four bits of the BO field specify how the branch is affected by
or affects the condition register and the count register. The encoding for the BO field is
shown in Table 3-25. The count register is cleared by hard reset.

2.3 Supervisor-Level Registers



There are registers in the MPC601 that can be accessed only by supervisor-level software. 
These include the machine state register (MSR), the segment registers, and a number of 
SPRs. 

2.3.1 Machine State Register (MSR)

The MSR, shown in Figure 2-11, is a 32-bit register that defines the state of the processor.
When an exception occurs, MSR bits are altered in accordance with Table 2-10. The MSR
can also be modified by the mtmsr, sc, and rfi instructions. It can be read by the mfmsr
instruction. Note that in 64-bit PowerPC implementations, the MSR is a 64-bit register.

Note that the MPC601 does not implement the branch trace enable bit, BE (bit 22), or the
recoverable exception bit, RE (bit 30). The state of these bits does not affect the operation
of the MPC601.

[Figure: bits 0-15 reserved; EE (16), PR (17), FP (18), ME (19), FE0 (20), SE (21),
FE1 (23), EP (25), IT (26), DT (27); remaining bits reserved.]

Figure 2-11. Machine State Register (MSR) 

Table 2-10 shows the bit definitions for the MSR. 

Table 2-10. Machine State Register Bit Settings

Bit(s)  Name  Description

0-15    —     Reserved

16      EE    External interrupt enable
              0  The processor delays recognition of external interrupts and decrementer
                 exception conditions.
              1  The processor is enabled to take an external interrupt or the decrementer
                 exception.

17      PR    Privilege level
              0  The processor can execute both user- and supervisor-level instructions.
              1  The processor can only execute user-level instructions.

18      FP    Floating-point available
              0  The processor prevents dispatch of floating-point instructions, including
                 floating-point loads, stores, and moves. Floating-point enabled program
                 exceptions can still occur and the FPRs can still be accessed.
              1  The processor can execute floating-point instructions, and can take
                 floating-point enabled exception type program exceptions.

19      ME    Machine check enable
              0  Machine check exceptions are disabled.
              1  Machine check exceptions are enabled.

20      FE0   Floating-point exception mode 0 (See Table 2-11.)

21      SE    Single-step trace enable
              0  The processor executes instructions normally.
              1  The processor generates a single-step trace exception upon the successful
                 execution of the next instruction. When this bit is set, the processor
                 dispatches instructions in strict program order. Successful execution means
                 the instruction caused no other exception. Single-step tracing may not be
                 present on all implementations.

22      —     Reserved* on the MPC601.

23      FE1   Floating-point exception mode 1 (See Table 2-11.)

24      —     Reserved. This bit corresponds to the AL bit of the POWER architecture.

25      EP    Exception prefix. The setting of this bit specifies whether an exception vector
              offset is prepended with Fs or 0s. In the following description, nnnnn is the
              offset of the exception. See Table 5-7.
              0  Exceptions are vectored to the physical address x'000n_nnnn'.
              1  Exceptions are vectored to the physical address x'FFFn_nnnn'.

26      IT    Instruction address translation
              0  Instruction address translation is disabled. When instruction translation is
                 off, EA is interpreted as described in Chapter 6, "Memory Management Unit."
              1  Instruction address translation is enabled.

27      DT    Data address translation
              0  Data address translation is disabled. When data translation is disabled, EA
                 is interpreted as described in Chapter 6, "Memory Management Unit."
              1  Data address translation is enabled.

28-29   —     Reserved

30      —     Reserved* on the MPC601.

31      —     Reserved* on the MPC601.

*These reserved bits may be used by other PowerPC processors. Attempting to change these bits does not
affect the operation of the MPC601. These bit positions always return a zero value when read.
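
As an illustration of the bit numbering in Table 2-10 (bit 0 is the most significant bit of the 32-bit register), the sketch below extracts named MSR bits and forms an exception vector address from the EP bit. The names `MSR_BITS`, `msr_bit`, and `exception_vector` are hypothetical, and the 20-bit offset argument stands for the n_nnnn value of a particular exception.

```python
# Bit positions taken from Table 2-10 (bit 0 = most significant bit).
MSR_BITS = {
    "EE": 16, "PR": 17, "FP": 18, "ME": 19, "FE0": 20,
    "SE": 21, "FE1": 23, "EP": 25, "IT": 26, "DT": 27,
}


def msr_bit(msr, name):
    """Return the value (0 or 1) of the named MSR bit."""
    return (msr >> (31 - MSR_BITS[name])) & 1


def exception_vector(msr, offset):
    """Combine the EP prefix with a 20-bit exception offset."""
    prefix = 0xFFF00000 if msr_bit(msr, "EP") else 0x00000000
    return prefix | (offset & 0xFFFFF)
```

For example, with EP set, an exception whose offset is x'00300' is vectored to x'FFF00300'; with EP clear, the same exception is vectored to x'00000300'.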

The floating-point exception mode bits are interpreted as shown in Table 2-11. For further 
details see Section 5.4.7.1, "Floating-Point Enabled Program Exceptions." Note that these 
bits are logically ORed, so that if either is set the processor operates in precise mode. 
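
The effect of ORing the two mode bits can be sketched as follows. The function name and the returned strings are hypothetical labels chosen for this illustration.

```python
def fp_exception_mode_601(fe0, fe1):
    """Return the floating-point exception mode the MPC601 operates in."""
    # FE0 and FE1 are logically ORed: any nonzero combination selects precise mode.
    return "precise" if (fe0 | fe1) & 1 else "disabled"


# All four combinations of (FE0, FE1) and the resulting mode.
MODES = {(fe0, fe1): fp_exception_mode_601(fe0, fe1)
         for fe0 in (0, 1) for fe1 in (0, 1)}
```

Only the (0, 0) combination disables floating-point exceptions; the two imprecise encodings behave the same as (1, 1).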

Table 2-11. Floating-Point Exception Mode Bits

FE0  FE1  Mode
0    0    Floating-point exceptions disabled
0    1    Floating-point imprecise nonrecoverable*
1    0    Floating-point imprecise recoverable*
1    1    Floating-point precise mode

*Because FE0 and FE1 are logically ORed on the MPC601, neither of these modes is available. If
either bit is set, the processor operates in precise mode.

Table 2-12 indicates the state of the MSR after a hard reset:

Table 2-12. State of MSR at Power Up

Bit(s)  Description
0-15    (Reserved)
16-18   0
19      1
20-23   0
24      0
25      1
26-27   0
28-30   (Reserved)
31      0

2.3.2 Segment Registers 

The sixteen 32-bit segment registers are present only in 32-bit PowerPC implementations.
Figure 2-12 shows the format of a segment register in the MPC601. Note that the fields in
the segment register are interpreted differently depending on the value of bit 0 (the T bit).

[Figure: T (bit 0), Ks (bit 1), Ku (bit 2), reserved (bits 3-7), VSID (bits 8-31).]

Figure 2-12. Segment Register Format (T = 0)

Segment registers can be accessed by using the mtsr and mtsrin instructions. Segment
register bit settings when T = 0 are described in Table 2-13.

Table 2-13. Segment Register Bit Settings (T = 0)

Bits   Name  Description
0      T     T = 0 selects this format
1      Ks    Supervisor-state memory key
2      Ku    User-state protection key
3-7    —     Reserved
8-31   VSID  Virtual segment ID

Figure 2-13 shows the bit definition when T = 1.

[Figure: T (bit 0), Ks (bit 1), Ku (bit 2), BUID (bits 3-11), Controller Specific (bits 12-31).]

Figure 2-13. Segment Register Format (T = 1)

The bits in the segment register when T = 1 are described in Table 2-14.

Table 2-14. Segment Register Bit Settings (T = 1)

Bits   Name  Description
0      T     T = 1 selects this format
1      Ks    Supervisor-state memory key
2      Ku    User-state protection key
3-11   BUID  Bus unit ID
12-31  —     Device specific data for I/O controller

If T=0 in the selected segment register, the effective address is a reference to an ordinary
memory segment. For memory segments the segmented address translation mechanism
may be superseded by the block address translation (BAT) mechanism. If not, the 52-bit
virtual address (VA) is formed by concatenating the following:

• The 24-bit VSID field from the segment register

• The 16-bit page index, EA[4-19]

• The 12-bit byte offset, EA[20-31]

The VA is then translated to a physical address as described in Section 6.8, "Memory
Segment Model."
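
The concatenation that forms the 52-bit virtual address can be written out as a short calculation. This is a sketch only: `virtual_address` is a hypothetical name, the segment register value is passed in directly, and the BAT mechanism that may supersede this translation is not modeled.

```python
def virtual_address(segment_register, ea):
    """Form the 52-bit VA from a T = 0 segment register value and a 32-bit EA."""
    if (segment_register >> 31) & 1:          # bit 0 is the T bit
        raise ValueError("T = 1 selects an I/O controller interface segment")
    vsid = segment_register & 0x00FFFFFF      # bits 8-31: 24-bit VSID
    page_index = (ea >> 12) & 0xFFFF          # EA[4-19]: 16-bit page index
    byte_offset = ea & 0xFFF                  # EA[20-31]: 12-bit byte offset
    return (vsid << 28) | (page_index << 12) | byte_offset
```

For instance, a VSID of x'ABCDEF' combined with the effective address x'12345678' gives the virtual address x'ABCDEF2345678'; the upper four bits of the effective address do not appear in the result.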

If T=1 in the selected segment register, the effective address is a reference to an I/O
controller interface segment. No reference is made to the page tables; address translation
continues as described in Section 6.10, "I/O Controller Interface Address Translation."

The segment registers are cleared by hard reset. 

2.3.3 Supervisor-Level SPRs 

Many of the SPRs can be accessed only by supervisor-level instructions. Some SPRs are
implementation-specific; some SPRs in the MPC601 may not be implemented in other
PowerPC processors, or may not be implemented in the same way. Table 2-15 summarizes
how the MPC601 treats the undefined bits in supervisor-level SPRs.

Some SPR bits are reserved, and should not be used. Some of these bits are used in other
PowerPC processors.

Table 2-15. Undefined Bits in Supervisor-Level SPRs

Register  Value Returned for Undefined Bits
MSR       Zero
FPSCR     Zero
SDR1      Zero
All BATs  Value last written to that bit position
HID0      Zero
HID1      Value last written to that bit position
HID2      Value last written to that bit position
HID15     Zero

When the MPC601 detects SPR encodings other than those defined in this document, it
either takes a program exception (if bit 0 of the SPR encoding is set) or it treats the
instruction as a no-op (if bit 0 of the SPR encoding is clear).

2.3.3.1 Synchronization for Supervisor-Level SPRs and Segment Registers

The processor has synchronization requirements when updating the following MMU
registers when the corresponding address translation is enabled (data accesses with
MSR[DT]=1 or instruction fetches with MSR[IT]=1):

• SDR1

• BATs (if MSR[DT]=1 or MSR[IT]=1)

• Segment registers

In addition, there are other software requirements that should be observed when modifying
these MMU registers and the MSR[IT] bit.

2.3.3.1.1 Context Synchronization 

The processor checks for read and write dependencies with respect to segment registers and
special purpose registers and executes series of instructions involving those registers so that
dependencies are not violated. For example, if an mtspr instruction writes a value to a
particular SPR and an mfspr instruction later in the instruction stream reads the same SPR,
the mfspr reads the value written by the mtspr.

It is important to note that dependencies caused by side effects of writing to segment
registers and SPRs are not checked automatically. If an mtspr instruction writes a value to
an SPR that changes how address translation is performed, a subsequent load instruction
cannot use the new translation until the CPU is explicitly synchronized by using one of the
following context-synchronizing operations:

• isync instruction

• sc instruction 

• rfi instruction 

• Any exception, other than machine check and system reset 

Note that the sync instruction, although not defined as context-synchronizing in the 
PowerPC architecture, can sometimes be used to provide the required synchronization. The 
MPC601 processor automatically provides all synchronization required for updates to the 
CR, CTR, LR, MSR, FPSCR, and XER registers in all cases. 

In general, context-synchronizing operations are required when writes to the MMU 
registers are preceded or followed by load or store instructions. 

Specifically, a context-synchronizing operation or a sync instruction must precede a
modification of the BAT or segment registers when the corresponding address translations
are enabled (data accesses with MSR[DT]=1 or instruction fetches with MSR[IT]=1). In
the case of the SDR1, a sync instruction must precede the modification of the SDR1 when
the corresponding address translations are enabled (data accesses with MSR[DT]=1 or
instruction fetches with MSR[IT]=1), guaranteeing that the reference and change bits are
updated in the correct context.

If the corresponding address translations are enabled (data accesses with MSR[DT]=1 or
instruction fetches with MSR[IT]=1), a context-synchronizing operation must follow the
modification of any of the above registers.

When several of the registers listed above are modified with no intervening instructions that 
are affected by the changes, no context synchronization or sync instructions are required 
between the alterations. However, instructions fetched and/or executed after the alteration 
but before the context synchronizing operation may be fetched and/or executed in either the 
context that existed before the alteration or the context established by the alteration. 

For synchronization within a sequence of instructions, the isync instruction can be used as
shown in example 1:

Example 1: Using the isync instruction — In this example a single segment register (n)
needs to be updated in a context where loads and stores might otherwise execute ahead of
the mtsr instruction and use the outdated address translation. Data and instruction address
translation is enabled (MSR[DT] = 1 and MSR[IT] = 1):

isync           # let all instructions already in the pipeline complete
mtsr sr,rn      # update the segment register
isync           # context-synchronize before the new translation is used

The first isync instruction allows all instructions in the pipeline to complete, allowing the
mtsr instruction to dispatch and execute by itself.

Example 2: Using the isync instruction with a series of register modifications — In
example 1, the single mtsr instruction could safely be replaced with a series of mtsr
instructions without each requiring an isync instruction. However, if both mtsr and mfsr
instructions are needed, they should be separated by an isync instruction, as follows:

isync
mtsr sr,r0      # series of segment register updates
mtsr sr,r1
...
mtsr sr,r7
isync           # separates the mtsr group from the mfsr group

mfsr r8,sr      # series of segment register reads
mfsr r9,sr
...
mfsr r15,sr
isync

Example 3: Using the rfi instruction — When several registers are updated with no
intervening loads or stores with MSR[DT]=1 or instruction fetches with MSR[IT]=1,
context synchronization between updates is unnecessary. For example, when an exception
is taken, the processor is synchronized automatically. In this example, a list of segment
registers is updated with several mtsr instructions followed by a single
context-synchronizing operation.

Because this example modifies all 16 segment registers (and therefore affects the segment
register(s) that control instruction fetching), this particular sequence must be executed in
direct address translation mode (MSR[IT] = 0). Therefore, no synchronization is required
before the segment registers are loaded. If the segment register(s) that control
instruction fetching are not to be reloaded, the sequence can be executed with instruction
address translation enabled (MSR[IT] = 1) and no additional synchronization before the
segment register instructions.

In this example the rfi instruction provides the needed synchronization after all 16 segment
registers are loaded and before translated loads and stores are executed.

mtsr sr,r0
mtsr sr,r1
...
mtsr sr,r15

<load rest of machine state> 

rfi

2.3.3.1.2 Other Requirements by Register

SDR1 and MSR—The SDR1 register should be modified only when MSR[IT] = 0. In
addition, the MSR[IT] bit should be altered only by software that has an address mapping
such that logical addresses directly map to physical addresses.

Segment Registers — The only fields that should be modified in the segment register that
is currently in use for instruction fetching are the Ks and Ku bits. Note that any time the
segment registers are updated, the changes are guaranteed to take effect (including changes
of the Kx bits) only after a context-synchronizing operation has occurred.

BAT Registers — The only fields that should be modified in the BAT register that is
currently in use for instruction fetching are the Ks, Ku, and V (valid) bits. In the case of
modifying the V bit for the BAT register currently in use for instruction accesses, the
instructions immediately following the mtspr for the BAT register must also be mapped by
the page address translation mechanism with the same logical to physical address mapping
(or alternately, the instructions must be duplicated in the newly mapped space). Note that
any time the BAT registers are updated, the changes are guaranteed to take effect (including
changes of the Kx bits) only after a context-synchronizing operation has occurred.

In order to make a BAT register pair valid in a manner such that the BTLB entry then
translates the current instruction stream, the following sequence should be used if fields in
both the upper and lower BAT registers are to be modified (for instruction address
translation):

1. The V bit in the BAT register pair should be cleared to 0.

2. The other fields in the BAT register pair should be initialized appropriately.

3. The V bit in the BAT register pair should be set to 1.

4. A context-synchronizing operation should be performed.

2.3.3.2 DAE/Source Instruction Service Register (DSISR) 

The 32-bit DSISR, shown in Figure 2-14, identifies the cause of data access and alignment
exceptions.

[Figure: DSISR (bits 0-31).]

Figure 2-14. DAE/Source Instruction Service Register (DSISR)

For information about bit settings, see Section 5.4.3, "Data Access Exception (x'00300'),"
and Section 5.4.6, "Alignment Exception (x'00600')."

The DSISR is cleared after a hard reset. 

2.3.3.3 Data Address Register (DAR) 

The DAR is a 32-bit register shown in Figure 2-15.

Figure 2-15. Data Address Register (DAR)

After a data access, I/O controller interface error, or alignment exception, DAR is set to the
effective address of a load or store element. For information, see Section 5.4.3, "Data
Access Exception (x'00300')," and Section 5.4.6, "Alignment Exception (x'00600')."

2.3.3.4 Decrementer (DEC) Register 

The DEC, shown in Figure 2-16, is a 32-bit decrementing counter that provides a 
mechanism for causing a decrementer exception after a programmable delay. On the 
MPC601, the DEC is driven by the same frequency as the RTC (7.8125 MHz). On other 
PowerPC processors, the DEC frequency is based on a subdivision of the processor clock. 
The DEC is cleared by hard reset. Note that if an external clock is connected to the RTC, 
the DEC can change from its original value of zeros without receiving an instruction to load 
the register. 



[Figure: DEC (bits 0-31).]

Figure 2-16. Decrementer Register (DEC)

2.3.3.4.1 Decrementer Operation 

The DEC counts down, causing an exception (unless masked) when it passes through zero. 
The DEC satisfies the following requirements: 

• The operation of the RTC and the DEC are coherent (that is, the counters are driven 
by the same fundamental time base). 

• Loading a GPR from the DEC has no effect on the DEC. 

• Storing a GPR to the DEC replaces the value in the DEC with the value in the GPR. 

• Whenever bit 0 of the DEC changes from 0 to 1, a decrementer exception request is
signaled. (The exception breaks the pipeline in such a way that instructions in the
execute stage (except for instructions that have been dispatched ahead of
undispatched integer instructions) complete execution, and instructions in the decode
stage remain undecoded until the exception handler returns control to the interrupted
program.) Multiple DEC exception requests may be received before the first
exception occurs; however, any additional requests are canceled when the exception
occurs for the first request.

• If the DEC is altered by software and the content of bit 0 is changed from 0 to 1, an
exception request is signaled.

The RTC input is sampled using the CPU clock. If the CPU clock frequency is less than
twice the RTC frequency, real-time clock (and decrementer) sampling and incrementing
errors will occur. Systems that change the CPU clock frequency dynamically beyond this
limit therefore require a method of saving and restoring the real-time clock register values
via external means.
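
The decrementer requirements listed in this section can be summarized in a small model. It is a sketch under simplifying assumptions: each call to `step_dec` stands for one decrement, exception masking and request cancellation are not modeled, and all names are hypothetical.

```python
def bit0_rose(old, new):
    """True when bit 0 (the most significant bit) changes from 0 to 1."""
    return (old >> 31) == 0 and (new >> 31) == 1


def step_dec(dec):
    """Decrement once; return (new_value, exception_requested)."""
    new_dec = (dec - 1) & 0xFFFFFFFF
    return new_dec, bit0_rose(dec, new_dec)


def write_dec(old_dec, gpr_value):
    """Model a software write to the DEC; return (new_value, exception_requested)."""
    new_dec = gpr_value & 0xFFFFFFFF
    return new_dec, bit0_rose(old_dec, new_dec)
```

In this model a request is raised when the count passes through zero (x'00000000' to x'FFFFFFFF'), and also when software writes a value with bit 0 set over a value with bit 0 clear.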

2.3.3.4.2 Writing and Reading the DEC 

The content of the DEC can be read or written using the mfspr and mtspr instructions, both
of which are supervisor-level when they refer to the DEC. However, the MPC601 also
allows the reading of the DEC in user mode (for POWER compatibility) by specifying
SPR6. Using a simplified mnemonic for the mtspr instruction, the DEC may be written
from GPR rA with the following:

mtdec rA

If the execution of this instruction causes bit 0 of the DEC to change from 0 to 1, an
exception request is signaled. The DEC may be read into GPR rA with the following
sequence:

mfdec rA 

Copying the DEC to a GPR does not affect the DEC content or the exception mechanism. 

2.3.3.5 Table Search Descriptor Register 1 (SDR1) 

The table search descriptor register 1 (SDR1) is shown in Figure 2-17.

[Figure: HTABORG (bits 0-15), reserved (bits 16-22), HTABMASK (bits 23-31).]

Figure 2-17. Table Search Descriptor Register 1 (SDR1)

The bits of the SDR1 are described in Table 2-16.

Table 2-16. Table Search Descriptor Register 1 (SDR1) Bit Settings

Bits   Name      Description
0-15   HTABORG   The high-order 16 bits of the 32-bit physical address of the page table
16-22  —         Reserved
23-31  HTABMASK  Mask for page table address

The HTABORG field in SDR1 contains the high-order 16 bits of the 32-bit physical address
of the page table. Therefore, the page table is constrained to lie on a 2^16-byte (64-Kbyte)
boundary at a minimum. At least 10 bits from the hash function are used to index into the
page table. The page table must consist of at least 64 Kbytes (2^10 PTEGs of 64 bytes each).

The page table can be any size 2^n bytes, where 16 ≤ n ≤ 25. As the table size is increased,
more bits are used from the hash to index into the table and the value in HTABORG must have
more of its low-order bits equal to 0. The HTABMASK field in SDR1 contains a mask
value that determines how many bits from the hash are used in the page table index. This
mask must be of the form b'00...011...1'; that is, a string of 0 bits followed by a string of 1
bits. The 1 bits determine how many additional bits (beyond the minimum of 10) from the
hash are used in the index; HTABORG must have this same number of low-order bits equal
to 0. See Figure 6-21.

The number of low-order 0 bits in HTABORG must be at least the number of 1 bits in
HTABMASK so that the final 32-bit physical address can be formed by logically ORing
the various components.
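
The constraints on HTABORG and HTABMASK can be checked mechanically. The sketch below is an illustration under the field positions given in Table 2-16; the function name `decode_sdr1` and the returned dictionary keys are hypothetical, and the reserved bits 16-22 are simply ignored.

```python
def decode_sdr1(sdr1):
    """Validate an SDR1 value and return the page table base, size, and index width."""
    htaborg = (sdr1 >> 16) & 0xFFFF    # bits 0-15
    htabmask = sdr1 & 0x1FF            # bits 23-31
    ones = bin(htabmask).count("1")
    if htabmask != (1 << ones) - 1:
        raise ValueError("HTABMASK must be a string of 0 bits followed by 1 bits")
    if htaborg & htabmask:
        raise ValueError("HTABORG needs a low-order 0 bit for every 1 bit in HTABMASK")
    return {
        "base": htaborg << 16,             # physical address of the page table
        "size_bytes": 1 << (16 + ones),    # 2^n bytes, 16 <= n <= 25
        "hash_index_bits": 10 + ones,      # hash bits used to select a PTEG
    }
```

With HTABMASK equal to zero the table has the minimum size of 64 Kbytes; each additional 1 bit in the mask doubles the table size, up to 2^25 bytes when all nine mask bits are set.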

2.3.3.6 Machine Status Save/Restore Register 0 (SRR0)

The machine status save/restore register 0 (SRR0) is a 32-bit register the MPC601 uses to
save machine status on exceptions and restore machine status when an rfi instruction is
executed. It also holds the EA for the instruction that follows the System Call (sc)
instruction. The SRR0 is shown in Figure 2-18.

[Figure: SRR0 (bits 0-31).]

Figure 2-18. Save/Restore Register 0 (SRR0)

When an exception occurs, SRR0 is set to point to an instruction such that all prior
instructions have completed execution and no subsequent instruction has begun execution.
The instruction addressed by SRR0 may not have completed execution, depending on the
exception type. SRR0 addresses either the instruction causing the exception or the
immediately following instruction. The instruction addressed can be determined from the
exception type and status bits.

The SRR0 is cleared by hard reset.

For information on how specific exceptions affect SRR0, refer to the descriptions of
individual exceptions in Chapter 5, "Exceptions."

2.3.3.7 Machine Status Save/Restore Register 1 (SRR1)

The SRR1 is a 32-bit register used to save machine status on exceptions and to restore
machine status when an rfi instruction is executed. The SRR1 is shown in Figure 2-19.

[Figure: SRR1, 32 bits, shown as bits 0-15 and bits 16-31.]

Figure 2-19. Machine Status Save/Restore Register 1 (SRR1)

In general, when an exception occurs, bits 0-15 of SRR1 are loaded with exception-specific
information and bits 16-31 of MSR are placed into bits 16-31 of SRR1.
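
The general rule for loading SRR1 amounts to combining two 16-bit halves, as sketched below. The function name and the `exception_info` parameter are hypothetical; individual exceptions may load SRR1 differently, as described in Chapter 5.

```python
def build_srr1(msr, exception_info=0):
    """Combine exception-specific bits 0-15 with MSR bits 16-31."""
    upper = (exception_info & 0xFFFF) << 16   # bits 0-15 of SRR1
    lower = msr & 0xFFFF                      # bits 16-31 copied from the MSR
    return upper | lower
```

The lower half of the result can later be used to restore MSR bits 16-31 when the exception handler returns.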

The SRR1 is cleared by hard reset.

For information on how specific exceptions affect SRR1, refer to the individual exceptions
in Chapter 5, "Exceptions."

2.3.3.8 General SPRs (SPRG0-SPRG3) 

SPRG0 through SPRG3 are 32-bit registers provided for general operating system use, such
as performing a fast state save and for supporting multiprocessor implementations.
SPRG0-SPRG3 are shown in Figure 2-20.

[Figure: SPRG0, SPRG1, SPRG2, and SPRG3, each shown as a 32-bit register.]

Figure 2-20. General SPRs (SPRG0-SPRG3)

Uses for SPRG0-SPRG3 are shown in Table 2-17.

Table 2-17. Uses of SPRG0-SPRG3

Register  Description

SPRG0     Software may load a unique physical address in this register to identify an area of
          memory reserved for use by the exception handler. This area must be unique for each
          processor in the system.

SPRG1     This register may be used as a scratch register by the exception handler to save the
          content of a GPR. That GPR then can be loaded from SPRG0 and used as a base register
          to save other GPRs to memory.

SPRG2     This register may be used by the operating system as needed.

SPRG3     This register may be used by the operating system as needed.



2.3.3.9 External Access Register (EAR) 

The EAR is a 32-bit SPR that controls access to the external control facility and identifies 
the target device for external control operations. The external control facility provides a 
means for user-level instructions to communicate with special external devices. The EAR 
is shown in Figure 2-21 . 
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Figure 2-21. External Access Register (EAR) 

This register is provided to support the External Control Input Word Indexed (eciwx) and 
External Control Output Word Indexed (ecowx) instructions, which are described in 
Chapter 10, "Instruction Set." Although access to the EAR is privileged, the operating 
system can determine which tasks are allowed to issue external access instructions and 
when they are allowed to do so. The bit settings for the EAR are described in Table 2-18. 
Interpretation of the physical address transmitted by the eciwx and ecowx instructions and 
the 32-bit value transmitted by the ecowx instruction is not prescribed by the PowerPC 
architecture but is determined by the target device. 

For example, if the external control facility is used to support a graphics adapter, the ecowx 
instruction could be used to send the translated physical address of a buffer containing 
graphics data to the graphics device. The eciwx instruction could be used to load status 
information from the graphics adapter. 

Table 2-18. External Access Register (EAR) Bit Settings 



Bit     Name   Description 

0       E      Enable bit 
               1  Enabled 
               0  Disabled 
               If this bit is set, the eciwx and ecowx instructions can perform the 
               specified external operation. If the bit is cleared, an eciwx or ecowx 
               instruction causes a data access exception. 

1-27    —      Reserved 

28-31   RID    Resource ID. The RID is formed by concatenating TBST||TSIZ0-TSIZ2. 
               Note that in other PowerPC implementations, this field may use bits 26-31. 



This register can also be accessed by using the mtspr and mfspr instructions with the SPR 
number 282 (b'01000 11010'). When reading from the EAR, the following sequence should 
be used: 

sync 

mfspr rD,282 

sync 

The EAR is cleared by hard reset. 
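
The EAR layout in Table 2-18 can be sketched in C as follows. The helper names are hypothetical and the code only manipulates a 32-bit image of the register; reading or writing the real EAR still requires the supervisor-level mfspr and mtspr sequence shown above.

```c
#include <stdint.h>
#include <stdbool.h>

/* E is bit 0, the most significant bit in PowerPC bit numbering. */
bool ear_enabled(uint32_t ear)
{
    return ((ear >> 31) & 1u) != 0;
}

/* RID occupies bits 28-31, the four least significant bits on the MPC601. */
unsigned ear_rid(uint32_t ear)
{
    return ear & 0xFu;
}

/* Hypothetical helper that builds an EAR image; reserved bits 1-27 stay zero. */
uint32_t ear_make(bool enable, unsigned rid)
{
    return ((enable ? 1u : 0u) << 31) | (rid & 0xFu);
}
```

An image of zero, which is the hard reset value, has E cleared, so eciwx and ecowx cause a data access exception until software sets the bit.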
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2.3.3.10 Processor Version Register (PVR) 

The PVR is a 32-bit, read-only register that identifies the version and revision level of the 
PowerPC processor (see Figure 2-22). The contents of the PVR can be copied to a GPR by 
the mfspr instruction. Read access to the PVR is available in supervisor mode only; write 
access is not provided. 



[Figure: PVR layout. Version field in bits 0-15, Revision field in bits 16-31.] 

Figure 2-22. Processor Version Register (PVR) 

The PVR consists of two 16-bit fields: 

• Version (bits 0-15) — A 16-bit number that identifies the version of the processor 
and of the PowerPC architecture. 

— The processor version number is x'0001' for the MPC601. 

— Other processor numbers assigned as of the initial release of the MPC601 are as 
follows: 

x'0003' 

x'0004' 

x'0014' 

• Revision (bits 16-31) — A 16-bit number that distinguishes between various releases 
of a particular version (that is, an engineering change level). The value of the 
revision portion of the PVR is implementation-specific. 

— The initial processor revision level is x'0000' and will be changed for each 
revision of the device. 

The PVR is set to x'00010001' by hard reset. 
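
Splitting a PVR value into its two fields can be sketched in C as shown below. The function names are hypothetical; the value passed in is assumed to have been obtained in supervisor mode with mfspr.

```c
#include <stdint.h>

/* Version field: PVR bits 0-15 (the upper half word). */
uint16_t pvr_version(uint32_t pvr)
{
    return (uint16_t)(pvr >> 16);
}

/* Revision field: PVR bits 16-31 (the lower half word). */
uint16_t pvr_revision(uint32_t pvr)
{
    return (uint16_t)(pvr & 0xFFFFu);
}

/* The MPC601 reports version x'0001'. */
int pvr_is_mpc601(uint32_t pvr)
{
    return pvr_version(pvr) == 0x0001u;
}
```

Applied to the hard reset value x'00010001', both field extractors return 1.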
2.3.3.11 BAT Registers 

The MPC601 includes eight block-address translation (BAT) registers, consisting of four 
pairs of BATs (BAT0U-BAT3U and BAT0L-BAT3L), as shown in Figure 2-1. Note that 
this differs somewhat from other PowerPC implementations, which have two sets of four 
pairs of BAT registers. One set contains instruction BATs, or IBATs, (IBAT0U-IBAT3U 
and IBAT0L-IBAT3L), which maps to the BAT registers implemented in the MPC601. The 
SPR numbers for these registers are listed in Figure 2-1. The additional eight registers are 
data BATs, or DBATs, (DBAT0U-DBAT3U and DBAT0L-DBAT3L). These BATs use the 
eight SPR numbers subsequent to those used by the IBATs (536-543). 

Note that the implementation of the bit fields within the BATs differs from that of other 
PowerPC implementations. Figure 2-23 and Figure 2-24 show the format of the upper and 
lower BAT registers. 
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Figure 2-23. Upper BAT Register 
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Figure 2-24. Lower BAT Register 

Table 2-19 describes the bits in the BAT registers. 

Table 2-19. BAT Registers 



Register    Bits    Name   Description 

Upper BAT   0-14    BLPI   Block logical page index. This field is compared with bits 0-14 of the 
Registers                  logical address to determine if there is a hit in that BTLB entry. 

            15-24   —      Reserved 

            25-27   WIM    Memory/cache access mode bits 
                           W  Write-through 
                           I  Caching-inhibited 
                           M  Memory coherence 
                           For detailed information about the WIM bits, see Section 6.3, 
                           "Memory/Cache Access Modes." 

            28      Ks     Supervisor mode key. This bit interacts with MSR[PR] and the PP field 
                           to determine the protection for the block. For more information, see 
                           Section 6.4, "General Memory Protection Mechanism." 

            29      Ku     User mode key. This bit also interacts with MSR[PR] and the PP field 
                           to determine the protection for the block. For more information, see 
                           Section 6.4, "General Memory Protection Mechanism." 

            30-31   PP     Protection bits for block. This field interacts with MSR[PR] and the 
                           Ks or Ku to determine the protection for the block as described in 
                           Section 6.4, "General Memory Protection Mechanism." 

Lower BAT   0-14    PBN    Physical block number. This field is used in conjunction with the BSM 
Registers                  field to generate bits 0-14 of the physical address of the block. 

            15-24   —      Reserved 

            25      V      BAT register pair (BTLB entry) is valid if V=1 

            26-31   BSM    Block size mask (0...5). BSM is a mask that encodes the size of the 
                           block. Values for this field are listed in Table 2-20. 



Table 2-20 lists the BAT area lengths encoded by BAT[BSM]. 
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Table 2-20. BAT Area Lengths 



BAT Area Length   BSM Encoding 

128 Kbytes        00 0000 
256 Kbytes        00 0001 
512 Kbytes        00 0011 
1 Mbyte           00 0111 
2 Mbytes          00 1111 
4 Mbytes          01 1111 
8 Mbytes          11 1111 



Only the values shown in Table 2-20 are valid for the BSM field. The rightmost bit of BSM 
is aligned with bit 14 of the logical address. A logical address is determined to be within 
a BAT area if the logical address matches the value in the BLPI field. 

The boundary between the string of zeros and the string of ones in BSM determines the bits 
of logical address that participate in the comparison with BLPI. Bits in the logical address 
corresponding to ones in BSM are cleared for this comparison. 

Bits in the logical address corresponding to ones in the BSM field, concatenated with the 
17 bits of the logical address to the right (less significant bits) of BSM, form the offset 
within the BAT area. 

The value loaded into BSM determines both the length of the BAT area and the alignment 
of the area in both logical and physical address space. The values loaded into BLPI and 
PBN must have at least as many low-order zeros as there are ones in BSM. 

The BAT registers are cleared by hard reset. 
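
The BLPI comparison and the offset formation described above can be modeled in C. This is a sketch under stated assumptions: `struct bat601` and the function names are hypothetical, the structure holds already-extracted field values rather than raw register images, and protection and WIM checks are left out.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical image of one MPC601 BAT pair (field names from Table 2-19). */
struct bat601 {
    uint16_t blpi;   /* upper BAT bits 0-14: block logical page index */
    uint16_t pbn;    /* lower BAT bits 0-14: physical block number    */
    uint8_t  bsm;    /* lower BAT bits 26-31: block size mask         */
    bool     valid;  /* lower BAT bit 25: V                           */
};

/* Only the contiguous-ones encodings of Table 2-20 are legal. */
bool bsm_is_valid(uint8_t bsm)
{
    return bsm <= 0x3Fu && (bsm & (bsm + 1u)) == 0;
}

/* Block length in bytes: 128 Kbytes times (BSM + 1). */
uint32_t bat_length(uint8_t bsm)
{
    return ((uint32_t)bsm + 1u) << 17;
}

/* Hit test: logical address bits 0-14, with the bits under BSM cleared,
 * must equal BLPI. */
bool bat_hit(const struct bat601 *b, uint32_t la)
{
    uint32_t index = la >> 17;                  /* logical address bits 0-14 */
    return b->valid && (index & ~(uint32_t)b->bsm) == b->blpi;
}

/* Physical address: PBN supplies the block number, the bits under BSM and
 * the low-order 17 bits of the logical address supply the offset. */
uint32_t bat_translate(const struct bat601 *b, uint32_t la)
{
    uint32_t index = la >> 17;
    uint32_t block = (uint32_t)b->pbn | (index & b->bsm);
    return (block << 17) | (la & 0x1FFFFu);
}
```

For a 1-Mbyte area (BSM = b'00 0111') mapped from logical x'00800000' to physical x'10000000', BLPI is x'0040' and PBN is x'0800'; both have three low-order zeros, as the alignment rule requires.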

2.3.3.12 MPC601 Implementation-Specific HID Registers 

PowerPC processors may have implementation-specific SPRs, referred to as HID registers. 
Additional SPR encodings allow access to the implementation-dependent registers within 
the MPC601. The SPR encodings for the MPC601's HID registers are described in 
Table 2-21. Note that these encodings use split-field notation; that is, the order of two 5-bit 
components of the 10-bit encoding is reversed. 
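
The split-field reversal can be illustrated with a short C sketch. The function names are hypothetical. The primary opcode (31) and extended opcode (339) used for mfspr are taken from the PowerPC instruction set definition rather than from this section, so they should be checked against Chapter 10.

```c
#include <stdint.h>

/* Swap the two 5-bit halves of a 10-bit SPR number, giving the value
 * that is actually placed in the instruction's SPR field. */
uint32_t spr_split_field(uint32_t spr)
{
    return ((spr & 0x1Fu) << 5) | ((spr >> 5) & 0x1Fu);
}

/* Assemble "mfspr rD,SPR" as a 32-bit instruction word.
 * Assumed layout: opcode 31 | rD | split SPR field | extended opcode 339. */
uint32_t encode_mfspr(unsigned rd, uint32_t spr)
{
    return (31u << 26)
         | ((uint32_t)(rd & 0x1Fu) << 21)
         | (spr_split_field(spr) << 11)
         | (339u << 1);
}
```

Because the operation only exchanges two halves, applying it twice returns the original SPR number. An assembler performs this swap automatically, so the decimal SPR number is what appears in source code.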
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Table 2-21 . Additional SPR Encodings 



SPR Number   SPR Encoding          Register Name                                   Access 
             SPR(5-9)|SPR(0-4) 

1008         11111 10000           Checkstop sources and enables register (HID0)   Supervisor 
1009         11111 10001           MPC601 debug modes register (HID1)              Supervisor 
1010         11111 10010           IABR (HID2)                                     Supervisor 
1013         11111 10101           DABR (HID5)                                     Supervisor 
1023         11111 11111           PIR (HID15)                                     Supervisor 



For additional information about the mtspr and mfspr instructions, refer to Chapter 10, 
"Instruction Set." 

2.3.3.12.1 Checkstop Sources and Enables Register— HID0 

The checkstop sources and enables register (HID0), shown in Figure 2-25, is a supervisor- 
level register that defines enable and monitor bits for each of the checkstop sources in the 
MPC601. The SPR number for HID0 is 1008. 
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Figure 2-25. Checkstop Sources and Enables Register (HIDO) 

Table 2-22 defines the bits in HID0. The enable bits (bits 15-31) can be used to mask 
individual checkstop sources, although these are provided primarily to mask off any false 
reports of such conditions for debugging purposes. Bit 0 (HID0[CE]) is a master checkstop 
enable; if it is cleared, all checkstop conditions are disabled; if it is set, individual 
conditions can be enabled separately. HID0[EM] (bit 16) enables and disables machine 
check checkstops; clearing this bit masks machine check checkstop conditions that occur 
when MSR[ME] is cleared. Bits 1-11 are the checkstop source bits, and can be used to 
determine the specific cause of a checkstop condition. 

All enable bits except 15 and 24 are disabled at start up. The operating system should enable 
these checkstop conditions before the power-on reset sequence is complete. 
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Table 2-22. Checkstop Sources and Enables Register (HIDO) Definition 



Bit     Name   Description 

0       CE     Master checkstop enable. Enabled if set. 
1       S      Microcode checkstop detected if set. 
2       M      Double machine check detected if set. 
3       TD     Multiple TLB hit checkstop if set. 
4       CD     Multiple cache hit checkstop if set. 
5       SH     Sequencer time out checkstop if set. 
6       DT     Dispatch time out checkstop if set. 
7       BA     Bus address parity error if set. 
8       BD     Bus data parity error if set. 
9       CP     Cache parity error if set. 
10      IU     Invalid microcode instruction if set. 
11      PP     I/O controller interface access protocol error if set. 
12-14   —      Reserved 
15      ES     Enable microcode checkstop. Enabled by hard reset. Enabled if set. 
16      EM     Enable machine check checkstop. Disabled by hard reset. Enabled if set. 
17      ETD    Enable TLB checkstop. Disabled by hard reset. Enabled if set. 
18      ECD    Enable cache checkstop. Disabled by hard reset. Enabled if set. 
19      ESH    Enable sequencer time out checkstop. Disabled by hard reset. Enabled if set. 
20      EDT    Enable dispatch time out checkstop. Disabled by hard reset. Enabled if set. 
21      EBA    Enable bus address parity checkstop. Disabled by hard reset. Enabled if set. 
22      EBD    Enable bus data parity checkstop. Disabled by hard reset. Enabled if set. 
23      ECP    Enable cache parity checkstop. Disabled by hard reset. Enabled if set. 
24      EIU    Enable for invalid ucode instruction checkstop. Enabled by hard reset. 
               Enabled if set. 
25      EPP    Enable for I/O controller interface access protocol checkstop. Disabled by 
               hard reset. Enabled if set. 
26      DRF    0  Optional reload of alternate sector on instruction fetch miss is enabled. 
               1  Optional reload of alternate sector on instruction fetch miss is disabled. 
27      DRL    0  Optional reload of alternate sector on load/store miss is enabled. 
               1  Optional reload of alternate sector on load/store miss is disabled. 
28      LM     0  Big-endian mode is enabled. 
               1  Little-endian mode is enabled. 
               For more information about byte ordering, see Section 2.4.3, "Byte and Bit 
               Ordering." Note that in the PowerPC architecture, the selection between big- 
               and little-endian mode is controlled by two bits in the MSR. 
29      PAR    0  Precharge of the ARTRY and SHD signals is enabled. 
               1  Precharge of the ARTRY and SHD signals is disabled. 
30      EMC    0  No error detected in main cache during array initialization. 
               1  Error detected in main cache during array initialization. 
31      EHP    0  The HP_SNP_REQ signal is disabled. Use of the WRS queue position is 
                  restricted to a snoop hit that occurs when a read is pending. That is, its 
                  address tenure is complete but the data tenure has not begun. 
               1  The HP_SNP_REQ signal is enabled. Use of the WRS queue position is 
                  restricted to a snoop hit on an address tenure that had HP_SNP_REQ asserted. 



The HID0 register is set to x'80010080' by the hard reset operation. 
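
The hard reset value can be checked against Table 2-22 with a short C sketch. The macro and function names are hypothetical; the code works on a copy of the register value and uses the PowerPC convention that bit 0 is the most significant bit.

```c
#include <stdint.h>
#include <stdbool.h>

/* Mask for PowerPC bit n of a 32-bit register (bit 0 is the MSB). */
#define PPC_BIT(n) (1u << (31 - (n)))

/* A few HID0 bit positions from Table 2-22. */
enum { HID0_CE = 0, HID0_ES = 15, HID0_EM = 16, HID0_EIU = 24, HID0_LM = 28 };

/* Test a single HID0 bit. */
bool hid0_bit(uint32_t hid0, int n)
{
    return (hid0 & PPC_BIT(n)) != 0;
}

/* Return only the checkstop enable bits (15-25) that are set. */
uint32_t hid0_enabled_sources(uint32_t hid0)
{
    uint32_t mask = 0;
    for (int n = 15; n <= 25; n++)
        mask |= hid0 & PPC_BIT(n);
    return mask;
}
```

For x'80010080' this reports CE, ES, and EIU as set and every other checkstop enable as cleared, which agrees with the statement that all enable bits except 15 and 24 are disabled at start up.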

2.3.3.12.2 MPC601 Debug Modes Register— HID1 

The MPC601 debug modes register (HID1) is a supervisor-level register that defines enable 
bits for the various debug modes supported by the MPC601; see Figure 2-26. The SPR 
number for HID1 is 1009. 
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Figure 2-26. MPC601 Debug Modes Register 

Table 2-23 shows bit settings for the HID1 register. Note that if the single instruction 
step option is specified for the M field (b'100') and the trap to run mode exception option 
is also specified in the RM field (b'10'), the processor iterates in an infinite loop. 

Table 2-23. HID1 Register Definition 



Bit     Name   Description 

0       —      Reserved 

1-3     M      MPC601 run modes 
               000  Normal run mode 
               001  Undefined. Do not use. 
               010  Limited instruction address compare. 
               011  Undefined. Do not use. 
               100  Single instruction step 
               101  Undefined. Do not use. 
               110  Full instruction address compare 
               111  Full branch target address compare 

4-7     —      Reserved 

8-9     RM     Response to address compare or single step 
               00  Hard stop (Stop LI clocks.) 
               01  Soft stop (Wait for system activity to quiesce.) 
               10  Trap to run mode exception (address vector x'02000'), with the base 
                   address indicated by the setting of MSR[IP]. This mode is valid for 
                   address comparisons and may produce unpredictable results when used 
                   with HID1 single-instruction step mode. 
               11  Reserved. Do not use. 

10-16   —      Reserved. Do not use. 

17-31   MISC   Miscellaneous latches 
               17     When high, this bit disables the broadcast of the tlbie instruction. 
               18-31  Reserved. Do not use. 



The HID1 register is cleared by a hard reset. 

2.3.3.12.3 Instruction Address Breakpoint Register (IABR)— HID2 

The instruction address breakpoint register (IABR) is also HID2. The IABR, shown in 
Figure 2-27, is a supervisor-level register defined to hold an effective address that is used 
to compare with the logical address of the instruction in the decode phase of the pipeline. 
The results of the comparison are used differently depending on the debug mode used. 

Figure 2-27. Instruction Address Breakpoint Register (IABR)— HID2 

Table 2-24 lists HID2 register definitions. The HID2 register is cleared by the hard reset 
operation. 

The SPR number for HID2 is 1010. 

Table 2-24. HID2 Register Definition 



Bit     Name   Description 

0-29    CEA    Comparison effective address 

30-31   —      Reserved. Should be set to zero. 

2.3.3.12.4 Data Address Breakpoint Register (DABR)— HID5 

The data address breakpoint register (DABR) (HID5), as shown in Figure 2-28, is designed 
to hold an effective address that is used to compare with the effective address of the various 
memory access instructions. The results of the comparison are used to cause a data access 
exception when the appropriate MPC601 debug mode bits are set (as described in 
Section 2.3.3.12.2, "MPC601 Debug Modes Register— HID1"). 
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Figure 2-28. Data Address Breakpoint Register (DABR) 



Table 2-25 describes bit settings in HID5. The HID5 register is cleared by the hard reset 
operation. 

Table 2-25. HID5 Register Definition 

Bit     Name   Description 

0-28    DAB    Data address breakpoint (EA). This field is set to the double-word EA to compare 
               with enabled load or store EAs. 

29      —      Reserved, although on an mfspr (DABR), the value returned is the value last 
               written. 

30-31   SA     Memory access types: 
               00  Breakpoints disabled 
               01  Breakpoints load accesses only 
               10  Breakpoints store accesses only 
               11  Breakpoints both load and store accesses 



The SPR number for HID5 is 1013. 

If the DABR feature is enabled, operations that hit against a properly enabled DABR cause 
a data access exception. For this type of data access exception (DAE), bit 9 of the DSISR 
is set and the data address register (DAR) contains the EA that caused the DABR match. If 
the access crossed a double-word boundary, the DAR contains the EA of the access from 
the first double word (even if the DABR match was on the second double word). For more 
information about data access exceptions, see Section 5.4.3, "Data Access Exception 
(x'00300')." 

Table 2-26 describes how each instruction type interacts with the DABR feature. 
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Table 2-26. DABR Results 



Operation        Description 

Load             If any part of the load access touches the double word specified in the DABR, 
instructions     and the appropriate enable bit is set, then the DAE occurs. In this case, the 
                 memory read operation is inhibited and register rD is not updated. If the 
                 operation is a load with update, the update to register rA is also inhibited. 

Store            If any part of the store access touches the double word specified in the DABR 
instructions     and the appropriate enable bit is set, the DAE occurs and the memory access is 
                 inhibited. 
                 If the operation is a store with update, then the update to register rA is also 
                 inhibited. 
                 If the operation is a Store Conditional instruction and the reservation bit is 
                 not set at the time of the DABR compare (at the end of execution as soon as the 
                 EA is calculated), the DAE is not taken. 

Load and store   These instructions are sequenced one register (one word) at a time through the 
string and       IU for EA calculation. Each access is checked against the DABR as it is 
multiple         presented to the ATU. If a match occurs, the instruction is aborted and a DAE 
instructions     is taken. 
                 If the initial EA for the string or multiple is not word-aligned, some 
                 individual accesses may cross a double word boundary. If either double word 
                 hits in the DABR, the access is inhibited and the DAE occurs. 

lscbx            This instruction is not supported by the DABR feature. No DAE occurs, even if 
instruction      the EA matches. 

Cache control    These instructions are not supported by the DABR feature. No DAE occurs even if 
instructions     the EA matches. 
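
The double-word comparison for ordinary loads and stores can be modeled in C. This is a sketch only: the names are hypothetical, the access length is assumed to be between 1 and 8 bytes so that at most two double words are touched, and the special cases in Table 2-26 (Store Conditional, lscbx, and cache control instructions) are not modeled.

```c
#include <stdint.h>
#include <stdbool.h>

/* SA field encodings (DABR bits 30-31) from Table 2-25. */
enum dabr_sa { DABR_OFF = 0, DABR_LOAD = 1, DABR_STORE = 2, DABR_BOTH = 3 };

/* Returns true if an access of len bytes at ea matches the DABR image. */
bool dabr_match(uint32_t dabr, uint32_t ea, uint32_t len, bool is_store)
{
    uint32_t sa = dabr & 0x3u;              /* bits 30-31: access types     */
    uint32_t dw = dabr & 0xFFFFFFF8u;       /* bits 0-28: double-word EA    */
    bool enabled = is_store ? (sa & DABR_STORE) != 0
                            : (sa & DABR_LOAD) != 0;

    uint32_t first = ea & 0xFFFFFFF8u;               /* double word of first byte */
    uint32_t last  = (ea + len - 1u) & 0xFFFFFFF8u;  /* double word of last byte  */

    return enabled && (first == dw || last == dw);
}
```

A word load at x'0FFE' touches bytes x'0FFE' through x'1001', so it matches a DABR set to the double word at x'1000' even though the access begins in the preceding double word.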



2.3.3.12.5 Processor Identification Register (PIR)— HID15 

The PIR register, shown in Figure 2-29, is a 32-bit, supervisor-level register that holds the 
4-bit processor identification tag (PID). This tag is useful for processor differentiation in 
multiprocessor system designs. The tag is also used to identify the sender and receiver tag 
for I/O controller interface operations. For more information, see Section 9.6, "I/O 
Controller Interface Operation." The PIR can be accessed by the mfspr instruction by using 
the SPR number 1023, as follows: 

sync 

mfspr rD,1023 

sync 



The PIR is cleared by the hard reset operation. 
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Figure 2-29. Processor Identification Register (PIR) 
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2.4 Operand Conventions 

This section describes the conventions used for storing values in registers and memory. 

2.4.1 Effect of Operand Placement on Performance 

The placement (location and alignment) of operands in memory affects the relative 
performance of memory accesses. The best performance is guaranteed if memory operands 
are aligned. To obtain the best performance across the widest range of PowerPC processor 
implementations, the programmer should assume the performance model described in 
Figure 2-30 with respect to the placement of memory operands. 



Operand 


Boundary Crossing 


Size 


Byte Alignment 


None 


Cache Line 


Page 


BAT/Segment 


Integer 




8 Byte 


8 
4 
<4 
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Good 
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Poor 
Poor 


Poor 
Poor 


4 Byte 


4 
<4 
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Good 


Poor 


Poor 


2 Byte 


2 
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Good 
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Poor 
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1 


Optimal 


— 


— 


— 


lmw, stmw 


4 


Good 


Good 
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Poor 


String 
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Poor 


Float 
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8 
4 
<4 
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Good 
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Poor 


Poor 
Poor 


4 Byte 


4 
<4 


optimal 
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Poor 


Poor 


Poor 



Figure 2-30. Performance Effects of Memory Operand Placement 

The performance of accesses varies depending on the following: 

• Operand size 

• Operand alignment 

• Crossing a cache block (sector) boundary 

• Crossing a page boundary 

• Crossing a BAT boundary 

• Crossing a segment boundary 

The load/store multiple instructions are defined to operate only on aligned operands. The 
Move Assist instructions have no alignment requirements. For the purposes of Figure 2-30, 
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crossing pages with different memory control attributes (WIM bits) is equivalent to 
crossing a segment boundary. 

2.4.1.1 Instruction Restart 

If a memory access crosses a page or segment boundary, a number of conditions could abort 
the execution of the instruction after part of the access has been performed. For example, 
this may occur when a program attempts to access a page it has not previously accessed or 
when the processor must check for a possible change in memory attributes when an access 
crosses a page boundary. When this occurs, the MPC601 or the operating system may 
restart the instruction. If the instruction is restarted, some bytes at that word address may 
be loaded from or stored to the target location a second time. 

The following rules apply to memory accesses with regard to restarting the instruction. 

• Aligned accesses — A single-register instruction that accesses an aligned operand is 
never restarted. 

• Misaligned accesses — A single-register instruction that accesses a misaligned 
operand may be restarted if the access crosses a page, BAT, or segment boundary. 

• Load/store multiple, move assist — These instructions may be restarted if, in 
accessing the locations specified by the instruction, a page, BAT, or segment 
boundary is crossed. 

2.4.1.2 Atomicity 

All aligned accesses are atomic. Instructions causing multiple accesses (for example, 
load/store multiple and move assist instructions) are not atomic. 

2.4.1.3 Access Order 

The ordering of memory accesses is not guaranteed unless the programmer inserts the 
appropriate ordering instructions, even if the accesses are generated by a single instruction. 
Misaligned accesses, load/store multiple instructions, and move assist instructions have no 
implicit ordering characteristics. For example, processor A may store a word operand on an 
odd half-word boundary. It may appear to processor A that the store completed atomically. 
Processor or other mechanism B, executing a load from the same location, may get a result 
that is a combination of the value of the first half word that existed prior to the store by 
processor A and the value of the second half word stored by processor A. 

2.4.2 Data Organization in Memory and Data Transfers 

Bytes in memory are numbered consecutively starting with 0. Each number is the address 
of the corresponding byte. 

Memory operands may be bytes, half words, words, or double words, or, for the load/store 
multiple and move assist instructions, a sequence of bytes or words. The address of a 
memory operand is the address of its first byte (that is, of its lowest-numbered byte). 
Operand length is implicit for each instruction. 
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2.4.2.1 Alignment and Misaligned Accesses 

The operand of a single-register memory access instruction has a natural alignment 
boundary equal to the operand length. In other words, the "natural" address of an operand 
is an integral multiple of the operand length. A memory operand is said to be aligned if it 
is aligned at its natural boundary; otherwise it is misaligned. 

Operands for single-register memory access instructions have the characteristics shown in 
Table 2-27. (Although not permitted as memory operands, quad words are shown because 
quad-word alignment is desirable for certain memory operands.) 

Table 2-27. Memory Operands 



Operand       Length     Addr(28-31) if aligned 

Byte          8 bits     xxxx 
Half word     2 bytes    xxx0 
Word          4 bytes    xx00 
Double word   8 bytes    x000 
Quad word     16 bytes   0000 

Note: An "x" in an address bit position indicates that the bit 
can be 0 or 1 independent of the state of other bits in 
the address. 

The concept of alignment is also applied more generally to data in memory. For example, 
12 bytes of data are said to be word-aligned if its address is a multiple of four. 

Some instructions require their memory operands to have certain alignments. In addition, 
alignment may affect performance. For single-register memory access instructions, the best 
performance is obtained when memory operands are aligned. Additional effects of data 
placement on performance are described in Chapter 7, "Instruction Timing." 

Instructions are four bytes long and word-aligned. 
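
The natural-alignment rule of Table 2-27 reduces to a mask test, sketched here in C. The function names are hypothetical, and the operand length is assumed to be a power of two (1, 2, 4, 8, or 16 bytes). The second helper is a generic boundary test that can be applied to any boundary size, for example an assumed 4-Kbyte page.

```c
#include <stdint.h>
#include <stdbool.h>

/* An operand is aligned if its address is an integral multiple of its
 * length; len must be a power of two. */
bool is_naturally_aligned(uint32_t addr, uint32_t len)
{
    return (addr & (len - 1u)) == 0;
}

/* True if the bytes addr .. addr+len-1 span two different blocks of
 * size `boundary` bytes (len >= 1, boundary > 0). */
bool crosses_boundary(uint32_t addr, uint32_t len, uint32_t boundary)
{
    return (addr / boundary) != ((addr + len - 1u) / boundary);
}
```

An aligned operand never crosses a boundary whose size is a multiple of the operand length, which is why aligned single-register accesses avoid the boundary-crossing cases altogether.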

2.4.3 Byte and Bit Ordering 

The PowerPC architecture supports both big- and little-endian byte ordering. The default 
byte and bit ordering is big-endian, as shown in Figure 2-31. Byte ordering can be set to 
little-endian by setting the LM bit in the HID0 register. Note that the mechanism for 
selecting between byte orderings is different in the MPC601 than it is in the PowerPC 
architecture. The PowerPC architecture provides two enable bits in the MSR that allow 
independent control for user- and supervisor-level software. 
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Big-Endian Bit Ordering 

Figure 2-31. Big-Endian Byte and Bit Ordering 

If scalars (individual computational data items) were indivisible, the concept of byte 
ordering would be unnecessary. Order of bits or groups of bits within the smallest 
addressable unit of memory is irrelevant, because nothing can be observed about such 
order. Order matters only when scalars, which the processor and programmer regard as 
indivisible quantities, can be made up of more than one addressable unit of memory. 

For a device in which the smallest addressable unit is the 64-bit double word, there is no 
question of the order of bytes within double words. All scalar transfers between registers 
and system memory are for double words and the address of the byte containing the high- 
order eight bits of a scalar is no different from the address of a byte containing any other 
part of the scalar. 

For PowerPC processors, as for most recent processor designs, the smallest addressable 
memory unit is the byte (8 bits), and most scalars are composed of groups of bytes. When 
a 32-bit scalar is moved from a register to memory, it occupies four consecutive byte 
addresses, and a decision must be made regarding the order of these bytes in these four 
addresses. 

The choice of byte ordering is arbitrary. There are 24 ways (4!) to specify the ordering of 
four bytes within a word, illustrated as all the permutations of ordering of four elements 
(ABCD, ABDC, ACBD, ACDB...DBCA, DCAB, DCBA), where A corresponds to the lowest 
address and D the highest. Only two of these orderings are practical: ABCD (big-endian) 
and DCBA (little-endian). 

2.4.3.1 Big-Endian Byte Ordering 

Big-endian ordering assigns the lowest address to the highest-order eight bits of the scalar. 
This is called big-endian because the big end of the scalar, considered as a binary number, 
comes first in memory. 

2.4.3.2 Little-Endian Byte Ordering 

Little-endian byte ordering assigns the lowest address to the lowest-order (rightmost) 8 bits 
of the scalar. The little end of the scalar, considered as a binary number, comes first in 
memory. 
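
The two orderings can be made concrete with a small C sketch that stores a 32-bit scalar one byte at a time. The function names are hypothetical; the code is independent of the byte order of the machine it runs on because it never reinterprets the scalar through a pointer.

```c
#include <stdint.h>

/* Big-endian: the highest-order eight bits go to the lowest address. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Little-endian: the lowest-order eight bits go to the lowest address. */
void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}
```

Storing x'11121314' with the first function places x'11' at the lowest address (ordering ABCD); the second places x'14' there (ordering DCBA).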
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2.4.4 Structure Mapping Examples 

The following C programming example contains an assortment of scalars and one character 
string. The value presumed to be in each structure element is shown in hexadecimal in the 
comments; these values are used below to show how the bytes that comprise each structure 
element are mapped into memory. 

struct { 
    int     a;      /* x'11121314'                   word            */ 
    double  b;      /* x'2122232425262728'           doubleword      */ 
    char *  c;      /* x'31323334'                   word            */ 
    char    d[7];   /* 'A','B','C','D','E','F','G'   array of bytes  */ 
    short   e;      /* x'5152'                       halfword        */ 
    int     f;      /* x'61626364'                   word            */ 
} s; 

Note that the C structure mapping introduces padding (skipped bytes) in the map in order 
to align the scalars on their proper boundaries: 4 bytes between a and b, one byte between 
d and e, and two bytes between e and f. Both big- and little-endian mappings use the same 
amount of padding. 
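
The padding can be confirmed with offsetof. The sketch below is an assumption-laden restatement of the example structure: fixed-width types replace int and short, and a uint32_t stands in for the 32-bit char * so that the layout is the same on a 64-bit host. It also assumes an ABI that aligns double on an 8-byte boundary; the helper name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout-equivalent stand-in for the example structure. */
struct S {
    int32_t  a;     /* word                                        */
    double   b;     /* double word                                 */
    uint32_t c;     /* stands in for the 32-bit char * of the text */
    char     d[7];  /* array of bytes                              */
    int16_t  e;     /* half word                                   */
    int32_t  f;     /* word                                        */
};

/* Number of skipped bytes between one member and the next. */
size_t pad_between(size_t prev_off, size_t prev_size, size_t next_off)
{
    return next_off - (prev_off + prev_size);
}
```

Under those assumptions the members fall at offsets 0, 8, 16, 20, 28, and 32, which gives the 4-, 1-, and 2-byte gaps described above.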

2.4.4.1 Big-Endian Mapping 

The big-endian mapping of a structure S is shown in Figure 2-32. Addresses are shown in 
hexadecimal at the left of each double word and in small figures below each byte. The 
content of each byte, as shown in the preceding C programming example, is shown in 
hexadecimal or, for the elements of the string, as characters. 

Figure 2-32. Big-Endian Mapping of Structure S 

2.4.4.2 Little-Endian Mapping 

Figure 2-33 shows the structure, S, using little-endian mapping. Double words are laid out 
from right to left. 
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Figure 2-33. Little-Endian Mapping of Structure S 

2.4.5 PowerPC Byte Ordering 

The default mapping for PowerPC processors is big-endian. Little-endian mode can be 
selected after a hard reset by setting the LM bit in the HID0 register in the MPC601 through 
the use of the mtspr instruction in the hard reset handler. The location of the bit is unique 
for each PowerPC processor. 

2.4.6 PowerPC Data Memory with LM Set 

One might expect that, with the LM bit set (little-endian mode), the system would have 
to perform two-, four-, or eight-way byte swaps when transferring a half word, word, or 
double word between memory and a register. However, the PowerPC architecture emulates 
little-endian byte ordering by manipulating the three low-order bits of the effective address. 
No bytes are swapped and individual multiple-byte scalars appear in memory in big-endian 
order. Setting LM adjusts the way effective addresses are computed without affecting the 
transfer of data between memory and registers, which is unencumbered by the need for 
multiplexers to swap bytes. 

2.4.6.1 Aligned Scalars 

For the load and store instructions listed in Table 2-28, the effective address is computed as 
specified in the instruction descriptions in Chapter 3, "Addressing Modes and Instruction 
Set Summary," and is modified as shown in Table 2-29. 

Table 2-28. Load/Store Instructions for Data Aligned on Natural Boundaries

Mnemonic    Instruction
lbz         Load Byte and Zero
lbzu        Load Byte and Zero with Update
lbzux       Load Byte and Zero with Update Indexed
lbzx        Load Byte and Zero Indexed
lfd         Load Floating-Point Double-Precision
lfdu        Load Floating-Point Double-Precision with Update
lfdux       Load Floating-Point Double-Precision with Update Indexed
lfdx        Load Floating-Point Double-Precision Indexed
lfs         Load Floating-Point Single-Precision
lfsu        Load Floating-Point Single-Precision with Update
lfsux       Load Floating-Point Single-Precision with Update Indexed
lfsx        Load Floating-Point Single-Precision Indexed
lha         Load Half Word Algebraic
lhau        Load Half Word Algebraic with Update
lhaux       Load Half Word Algebraic with Update Indexed
lhax        Load Half Word Algebraic Indexed
lhbrx       Load Half Word Byte-Reverse Indexed
lhz         Load Half Word and Zero
lhzu        Load Half Word and Zero with Update
lhzux       Load Half Word and Zero with Update Indexed
lhzx        Load Half Word and Zero Indexed
lwa         Load Word Algebraic
lwarx       Load Word and Reserve Indexed
lwaux*      Load Word Algebraic with Update Indexed
lwax*       Load Word Algebraic Indexed
lwbrx       Load Word Byte-Reverse Indexed
lwz         Load Word and Zero
lwzu        Load Word and Zero with Update
lwzux       Load Word and Zero with Update Indexed
lwzx        Load Word and Zero Indexed
stb         Store Byte
stbu        Store Byte with Update
stbux       Store Byte with Update Indexed
stbx        Store Byte Indexed
stfd        Store Floating-Point Double-Precision
stfdu       Store Floating-Point Double-Precision with Update
stfdux      Store Floating-Point Double-Precision with Update Indexed
stfdx       Store Floating-Point Double-Precision Indexed
stfiwx*     Store Floating-Point as Integer Word Indexed
stfs        Store Floating-Point Single-Precision
stfsu       Store Floating-Point Single-Precision with Update
stfsux      Store Floating-Point Single-Precision with Update Indexed
stfsx       Store Floating-Point Single-Precision Indexed
sth         Store Half Word
sthbrx      Store Half Word Byte-Reverse Indexed
sthu        Store Half Word with Update
sthux       Store Half Word with Update Indexed
sthx        Store Half Word Indexed
stw         Store Word
stwbrx      Store Word Byte-Reverse Indexed
stwcx.      Store Word Conditional Indexed
stwu        Store Word with Update
stwux       Store Word with Update Indexed
stwx        Store Word Indexed

*Not implemented in the MPC601

Table 2-29 shows how the EA is modified.

Table 2-29. EA Modifications

Data Width (Bytes)    EA Modification
8                     No change
4                     XOR with b'100'
2                     XOR with b'110'
1                     XOR with b'111'


The modified EA is passed to the data cache or the main memory and the specified width 
of the data is transferred between a GPR or FPR and the (as modified) addressed memory 
locations. Although the data is stored using big-endian byte ordering (but not in the same 
bytes within double words as with LM = 0), the modification of the EA makes it appear to 
the processor that it is stored in little-endian mode. 
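
The EA modification in Table 2-29 can be sketched in C as follows. This is only an illustration of the rule, not a description of the hardware; the function name le_modified_ea is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the address presented to the cache or memory for an aligned
   access of `width` bytes at effective address `ea` when LM = 1.
   The XOR mask follows Table 2-29:
   8 -> no change, 4 -> b'100', 2 -> b'110', 1 -> b'111'. */
static uint32_t le_modified_ea(uint32_t ea, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4 || width == 8);
    assert(ea % width == 0);          /* aligned scalars only */
    return ea ^ (8u - width);
}
```

For example, a word access at EA x'08' is presented to memory as x'0C', and a byte access at EA x'14' is presented as x'13'. The mask 8 - width happens to reproduce the four table entries; it is a compact way to write the table, not an architectural definition.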

The structure S would be placed in memory as shown in Figure 2-34. 

Figure 2-34. PowerPC Little-Endian Structure S in Memory or Cache

Because of the modifications on the EA, the same structure S appears to the processor to be 
mapped into memory this way when LM = 1 (little-endian enabled). This is shown in 
Figure 2-35. 

Figure 2-35. PowerPC Little-Endian Structure S as Seen by Processor

Note that as seen by the program executing in the processor, the mapping for the structure 
S is identical to the little-endian mapping shown in Figure 2-33. From outside of the 
processor, the addresses of the bytes making up the structure S are as shown in Figure 2-34. 
These addresses match neither the big-endian mapping of Figure 2-32 nor the little-endian
mapping of Figure 2-33. This must be taken into account when performing I/O operations 
in little-endian mode; this is discussed in Section 2.4.8, "PowerPC Input/Output in Little- 
Endian Mode." 

2.4.6.2 Misaligned Scalars

Performing an XOR operation on the low-order bits of the address of a scalar requires the 
scalar to be aligned on a boundary equal to a multiple of its length. When executing in little- 
endian mode (LM = 1), the MPC601 takes an alignment exception whenever any of the load 
and store instructions listed in Table 2-28 is issued with a misaligned EA, regardless of 
whether such an access could be handled without causing an exception in big-endian mode 
(LM = 0). 

The PowerPC architecture defines that half words, words, and double words be placed in
memory such that the little-endian address of the lowest-order byte is the EA computed by
the load or store instruction; the little-endian address of the next-lowest-order byte is one
greater, and so on. Figure 2-36 shows a four-byte word stored at little-endian address 5. The
word is presumed to contain the binary representation of x'11121314'.

Figure 2-36. PowerPC Little-Endian Mode, Word Stored at Address 5

Figure 2-37 shows the same word stored by a little-endian program, as seen by the memory 
system (assuming big-endian mode). 

Figure 2-37. Word Stored at Little-Endian Address 5 as Seen by Big-Endian Addressing

Note that the misaligned word in this example spans two double words. The two parts of 
the misaligned word are not contiguous in the big-endian addressing space. 

An implementation may choose to support only a subset of misaligned little-endian 
memory accesses. For example, misaligned little-endian accesses contained within a single 
double word may be supported, while those that span double words may cause alignment 
exceptions. 

2.4.6.3 Non-Scalars

The PowerPC architecture has two types of instructions that handle non-scalars (multiple 
instances of scalars). Neither type can deal with the modified EAs required in little-endian 
mode and both types cause alignment exceptions. 

2.4.6.3.1 String Operations 

The load and store string instructions, listed in Table 2-30, cause alignment exceptions
when they are executed in little-endian mode (HID0[LM] = 1).

Table 2-30. Load/Store String Instructions that Take Alignment Exceptions if LM = 1

Mnemonic    Description
lswi        Load String Word Immediate
lswx        Load String Word Indexed
stswi       Store String Word Immediate
stswx       Store String Word Indexed
lscbx       Load String and Compare Byte Indexed


String accesses are inherently misaligned; they transfer word-length quantities between 
memory (cache) and registers, but the quantities are not necessarily aligned on word 
boundaries. 

Note that the system software must determine whether to emulate the excepting instruction 
or treat it as an illegal operation. Because little-endian mode programs are new with respect 
to the PowerPC architecture — that is, they are not POWER binaries — having the compiler 
generate these instructions in little-endian mode would be slower than processing the string 
in-line or by using a subroutine call. 

2.4.6.3.2 Load and Store Multiple Instructions 

The following instructions cause alignment exceptions when executed in little-endian
mode (HID0[LM] = 1).

Table 2-31. Load/Store Multiple Instructions that Take Alignment Exceptions if LM = 1

Mnemonic    Instruction
lmw         Load Multiple Word
stmw        Store Multiple Word



Although the words addressed by these instructions are on word boundaries, each word is 
in the half of its containing double word opposite from where it would be in big-endian 
mode. 

Note that the system software must determine whether to emulate the excepting instruction
or treat it as an illegal operation. Because little-endian mode programs are new with respect
to the PowerPC architecture (that is, they are not POWER binaries), having the compiler
generate these instructions in little-endian mode would be slower than processing the string
in-line or by using a subroutine call.

2.4.7 PowerPC Instruction Memory Addressing in Little-Endian Mode

Each PowerPC instruction occupies 32 bits (one word) of memory. PowerPC processors
fetch and execute instructions as if the current instruction address had been advanced one
word for each sequential instruction. When operating with LM = 1, the address is modified
according to the little-endian rule for fetching word-length scalars; that is, it is XORed with
b'100'. A program is thus an array of little-endian words with each word fetched and
executed in order (not including branches).



Consider the following example:

loop:
        cmplwi  r5,0
        beq     done
        lwzux   r4,r5,r6
        add     r7,r7,r4
        subi    r5,1
        b       loop
done:
        stw     r7,total



Assuming the program starts at address 0, these instructions are mapped into memory for 
big-endian execution as shown in Figure 2-38. 

Figure 2-38. PowerPC Big-Endian, Instruction Sequence as Seen by Processor

If this same program is assembled for and executed in little-endian mode, the mapping seen
by the processor appears as shown in Figure 2-39.

Each machine instruction appears in memory as a 32-bit integer containing the value 
described in the instruction description, regardless of whether LM is set. This is because 
scalars are always mapped in memory in big-endian byte order. 

Figure 2-39. PowerPC Little-Endian, Instruction Sequence as Seen by Processor

When little-endian mapping is used, all references to the instruction stream must follow 
little-endian addressing conventions, including addresses saved in system registers when 
the exception is taken, return addresses saved in the link register, and branch displacements 
and addresses. 

• An instruction address placed in the link register by branch and link, or an 
instruction address saved in an SPR when an exception is taken is the address that a 
program executing in little-endian mode would use to access the instruction as a 
word of data using a load instruction. 

• An offset in a relative branch instruction reflects the difference between the 
addresses of the instructions, where the addresses used are those that a program 
executing in little-endian mode would use to access the instructions as data words 
using a load instruction. 

• A target address in an absolute branch instruction is the address that a program 
executing in little-endian mode would use to access the target instruction as a word 
of data using a load instruction. 

2.4.8 PowerPC Input/Output in Little-Endian Mode 

Input/output operations, such as writing the contents of a memory page to disk, transfer a
byte stream on both big- and little-endian systems. For the disk transfer, byte 0 of the page
is written to the first byte of a disk record and so on.

For a PowerPC system running in big-endian mode, both the processor and the memory 
subsystem recognize the same byte as byte 0. However, this is not true for a PowerPC 
system running in little-endian mode because of the modification of the three low-order bits 
when the processor accesses memory. 

In order for I/O transfers in little-endian mode to appear to transfer bytes properly, they
must be performed as if the bytes transferred were accessed one at a time, using the
little-endian address modification appropriate for single-byte transfers (XOR the low-order
address bits with b'111'). This does not mean that I/O on little-endian PowerPC machines
must be done using only one-byte-wide transfers. Data transfers can be as wide as desired,
but the order of the bytes within double words must be as if they were fetched or stored
one at a time.
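
As an illustration of this rule, the following sketch places a byte stream (for example, a disk record) into a memory image as a processor running with LM = 1 would expect to find it. The function name and the restriction to whole double words are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies `len` bytes of a byte stream into a memory image so that
   stream byte i lands at image offset (i XOR 7), which is where a
   single-byte access to little-endian address i goes when LM = 1.
   `len` must cover whole double words. */
static void stream_to_le_image(uint8_t *image, const uint8_t *stream, size_t len)
{
    assert(len % 8 == 0);
    for (size_t i = 0; i < len; i++)
        image[i ^ 7u] = stream[i];
}
```

A real I/O controller would typically move whole double words at a time; the loop only shows the resulting byte order within each double word.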

Note that not all I/O operations performed in PowerPC systems are for large blocks as
described above. I/O operations can be performed with certain devices by merely storing 
to or loading from addresses that are associated with the devices (this is referred to as I/O 
controller interface operations). Care must be taken with such operations when defining the 
addresses to be used because these addresses are subjected to the EA modifications 
described in Table 2-29. A load or store that maps to a control register on a device may 
require the bytes of the value transferred to be reversed. If this reversal is required, the loads 
and stores with byte reversal instructions may be used. 

2.4.9 Floating-Point Execution Models

The IEEE-754 standard includes 32-bit and 64-bit arithmetic. The standard requires that 
single-precision arithmetic be provided for single-precision operands. The standard permits 
double-precision arithmetic instructions to have either (or both) single-precision or double- 
precision operands, but states that single-precision arithmetic instructions should not 
accept double-precision operands. 

The PowerPC architecture follows these guidelines: 

• Double-precision arithmetic instructions can have operands of either or both 
precisions 

• Single-precision arithmetic instructions require all operands to be single-precision 

• Double-precision arithmetic instructions produce double-precision values 

• Single-precision arithmetic instructions produce single-precision values 

For arithmetic instructions, conversions from double- to single-precision must be done
explicitly by software, while conversions from single- to double-precision are done
implicitly.

All implementations of the PowerPC architecture provide the equivalent of the following 
execution models to ensure that identical results are obtained. Definition of the arithmetic 
instructions for infinities, denormalized numbers, and NaNs follow conventions described 
in following sections. 

Although the double-precision format specifies an 11-bit exponent, exponent arithmetic
uses two additional bit positions to avoid potential transient overflow conditions. An extra 
bit is required when denormalized double-precision numbers are prenormalized. A second 
bit is required to permit computation of the adjusted exponent value in the following cases 
when the corresponding exception enable bit is one: 

• Underflow during multiplication using a denormalized factor. 

• Overflow during division using a denormalized divisor. 

2.4.9.1 Execution Model for IEEE Operations

The following description uses 64-bit arithmetic as an example; 32-bit arithmetic is similar
except that the fraction field is a 23-bit field and the single-precision guard, round, and
sticky bits (described in this section) are logically adjacent to the 23-bit FRACTION (or
mantissa) field.

The bits and fields for the IEEE 64-bit execution model are defined as follows: 

• The S bit is the sign bit. 

• The C bit is the carry bit that captures the carry out of the significand. 

• The L bit is the leading unit bit of the significand which receives the implicit bit from 
the operands. 

• The FRACTION is a 52-bit field, which accepts the fraction (mantissa) of the 
operands. 

• The guard (G), round (R), and sticky (X) bits are extensions to the low-order bits of 
the accumulator. The G and R bits are required for post normalization of the result. 
The G, R, and X bits are required during rounding to determine if the intermediate 
result is equally near the two nearest representable values. The X bit serves as an 
extension to the G and R bits by representing the logical OR of all bits that may 
appear to the low-order side of the R bit, either due to shifting the accumulator right 
or other generation of low-order result bits. The G and R bits participate in the left 
shifts with zeros being shifted into the R bit. Table 2-32 shows the significance of 
the G, R, and X bits with respect to the intermediate result (IR), the next lower in 
magnitude representable number (NL), and the next higher in magnitude 
representable number (NH).



Table 2-32. Interpretation of G, R, and X Bits

G    R    X    Interpretation
0    0    0    IR is exact
0    0    1    IR closer to NL
0    1    0    IR closer to NL
0    1    1    IR closer to NL
1    0    0    IR midway between NL & NH
1    0    1    IR closer to NH
1    1    0    IR closer to NH
1    1    1    IR closer to NH



The significand of the intermediate result is made up of the L bit, the FRACTION, and the 
G, R, and X bits. 
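
The interpretation of the G, R, and X bits can be written as a small decision function. This is a sketch for illustration only; the enumeration and function names are not taken from the manual.

```c
#include <assert.h>

/* Position of the intermediate result (IR) relative to the next lower
   (NL) and next higher (NH) representable magnitudes. */
enum ir_position {
    IR_EXACT,
    IR_CLOSER_TO_NL,
    IR_MIDWAY,
    IR_CLOSER_TO_NH
};

/* Classifies the intermediate result from its guard, round and sticky
   bits, following Table 2-32. Each argument must be 0 or 1. */
static enum ir_position classify_grx(unsigned g, unsigned r, unsigned x)
{
    assert(g <= 1 && r <= 1 && x <= 1);
    if (!g)
        return (r | x) ? IR_CLOSER_TO_NL : IR_EXACT;
    return (r | x) ? IR_CLOSER_TO_NH : IR_MIDWAY;
}
```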

The infinitely precise intermediate result of an operation is the result normalized in bits L, 
FRACTION, G, R, and X of the floating-point accumulator. 

Before results are stored into an FPR, the significand is rounded if necessary, using the 
rounding mode specified by FPSCR[RN]. If rounding causes a carry into C, the significand 
is shifted right one position and the exponent is incremented by one. This may yield an
inexact result and possibly exponent overflow. Fraction bits to the left of the bit position 
used for rounding are stored into the FPR, and low-order bit positions, if any, are set to zero. 

Four rounding modes are provided which are user-selectable through FPSCR[RN] as 
described in Section 2.4.9.6, "Rounding." For rounding, the conceptual guard, round, and 
sticky bits are defined in terms of accumulator bits. 

Table 2-33 shows the positions of the guard, round, and sticky bits for double-precision and 
single-precision floating-point numbers. 

Table 2-33. Location of the Guard, Round and Sticky Bits

Format    Guard    Round    Sticky
Double    G bit    R bit    X bit
Single    24       25       26-52, G, R, X



Rounding can be treated as though the significand were shifted right, if required, until the
least significant bit to be retained is in the low-order bit position of the FRACTION. If any 
of the guard, round, or sticky bits are non-zero, the result is inexact. 

Z1 and Z2, defined in Section 2.4.9.6, "Rounding," can be used to approximate the result
in the target format when one of the following rules is used: 

• Round to nearest

  - Guard bit = 0: The result is truncated. (Result exact (GRX = 000) or closest to
    next lower value in magnitude (GRX = 001, 010, or 011).)

  - Guard bit = 1: Depends on round and sticky bits:

    Case a: If the round or sticky bit is one (inclusive), the result is incremented
    (result closest to next higher value in magnitude (GRX = 101, 110, or 111)).

    Case b: If the round and sticky bits are zero (result midway between closest
    representable values), then if the low-order bit of the result is one, the result is
    incremented. Otherwise (the low-order bit of the result is zero) the result is
    truncated (this is the case of a tie rounded to even).

• If during the round to nearest process, truncation of the unrounded number produces
the maximum magnitude for the specified precision, the following action is taken:

  - Guard bit = 1: Store infinity with the sign of the unrounded result.

  - Guard bit = 0: Store the truncated (maximum magnitude) value.

• Round toward zero: Choose the smaller in magnitude of Z1 or Z2. If the guard,
round, or sticky bit is non-zero, the result is inexact.

• Round toward +infinity
Choose Z1.

• Round toward -infinity
Choose Z2.

Where the result is to have fewer than 53 bits of precision because the instruction is 
a floating round to single-precision or single-precision arithmetic instruction, the 
intermediate result either is normalized or is placed in correct denormalized form 
before the result is potentially rounded. 
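
The round-to-nearest rule above can be sketched on an integer significand. This is an illustrative model only: the function name is hypothetical, the retained bits (L bit and FRACTION) are held in an ordinary integer, and a carry out of the significand is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds to nearest, ties to even.
   `sig` holds the bits to be retained; g, r and x are the guard, round
   and sticky bits (each 0 or 1). If the increment carries out of the
   significand, the caller is expected to shift right one position and
   increment the exponent, as described in the text. */
static uint64_t round_nearest_even(uint64_t sig, unsigned g, unsigned r, unsigned x)
{
    assert(g <= 1 && r <= 1 && x <= 1);
    if (!g)
        return sig;                     /* GRX = 0xx: truncate          */
    if (r | x)
        return sig + 1;                 /* GRX = 101, 110, 111: round up */
    return (sig & 1u) ? sig + 1 : sig;  /* GRX = 100: tie, pick even     */
}
```

For instance, a retained value of x'11' with GRX = 100 rounds to x'12', while x'10' with GRX = 100 stays at x'10'.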

2.4.9.1.1 Execution Model for Multiply-Add Type Instructions 

The PowerPC architecture makes use of a special form of instruction that performs up to 
three operations in one instruction (a multiply, an add, and a negate). With this added 
capability is the special feature of being able to produce a more exact intermediate result as 
an input to the rounder. The 32-bit arithmetic is similar except that the fraction field is 
smaller. Note that the rounding occurs only after add; therefore, the computation of the sum 
and product together are infinitely precise before the final result is rounded to a 
representable format. 

The first part of the operation is a multiply. The multiply has two 53-bit significands as 
inputs, which are assumed to be prenormalized, and produces a result conforming to the 
above model. If there is a carry out of the significand (into the C bit), the significand is 
shifted right one position, placing the L bit into the most significant bit of the FRACTION 
and placing the C bit into the L bit. All 106 bits (L bit plus the fraction) of the product take 
part in the add operation. If the exponents of the two inputs to the adder are not equal, the 
significand of the operand with the smaller exponent is aligned (shifted) to the right by an 
amount added to that exponent to make it equal to the other input's exponent. Zeros are
shifted into the left of the significand as it is aligned and bits shifted out of bit 105 of the 
significand are ORed into the X' bit. The add operation also produces a result conforming 
to the above model with the X' bit taking part in the add operation. 

The result of the add is then normalized, with all bits of the add result, except the X' bit, 
participating in the shift. The normalized result provides an intermediate result as input to 
the rounder that conforms to the model described in Section 2.4.9.1, "Execution Model for
IEEE Operations," where: 

• The guard bit is bit 53 of the intermediate result. 

• The round bit is bit 54 of the intermediate result. 

• The sticky bit is the OR of all remaining bits to the right of bit 55, inclusive. 

If the instruction is floating negative multiply-add or floating negative multiply-subtract, 
the final result is negated. 

Status bits are set to reflect the result of the entire operation; for example, no status is
recorded for the result of the multiplication part of the operation.

2.4.9.2 Floating-Point Data Format 

The PowerPC architecture defines the representation of a floating-point value in two 
different binary, fixed-length formats. The format may be a 32-bit format for a single- 
precision floating-point value or a 64-bit format for a double-precision floating-point value. 
The single-precision format may be used for data in memory. The double-precision format 
can be used for data in memory or in floating-point registers. 

The length of the exponent and the fraction fields differ between these two precision 
formats. The structure of the single-precision format is shown in Figure 2-40; the structure 
of the double-precision format is shown in Figure 2-41. 



    [ S | EXP | FRACTION ]
      0   1-8   9-31

Figure 2-40. Floating-Point Single-Precision Format

    [ S | EXP  | FRACTION ]
      0   1-11   12-63

Figure 2-41. Floating-Point Double-Precision Format

Values in floating-point format consist of three fields: 

• S (sign bit). 

• EXP (exponent+bias) 

• FRACTION (fraction) 

If only a portion of a floating-point data item in memory is accessed, as with a load or store
instruction for a byte or halfword (or word in the case of floating-point double-precision
format), the value affected depends on whether the PowerPC system is using big- or
little-endian byte ordering, which is described in Section 2.4.3, "Byte and Bit Ordering."
Big-endian mode is the default.

The significand consists of a leading implied bit concatenated on the right with the
FRACTION. This leading implied bit is a 1 for normalized numbers and a 0 for
denormalized numbers in the unit bit position (that is, the first bit to the left of the binary
point). Values representable within the two floating-point formats can be specified by the
parameters listed in Table 2-34.

Table 2-34. IEEE Floating-Point Fields

Parameter                      Single-Precision    Double-Precision
Exponent bias                  +127                +1023
Maximum exponent (unbiased)    +127                +1023
Minimum exponent               -126                -1022
Format width                   32 bits             64 bits
Sign width                     1 bit               1 bit
Exponent width                 8 bits              11 bits
Fraction width                 23 bits             52 bits
Significand width              24 bits             53 bits



The exponent is expressed as an 8-bit value for single-precision numbers or an 11-bit value
for double-precision numbers. These bits hold the biased exponent; the true value of the
exponent can be determined by subtracting 127 for single-precision numbers and 1023 for
double-precision values. This is shown in Figure 2-42. Note that using a bias eliminates the
need for a sign bit. The highest-order bit of the biased exponent both contributes to its value
and acts as an implicit sign bit. Note also that two values are reserved: all bits set indicates
that the number is an infinity or NaN, and all bits cleared indicates that the number is either
zero or denormalized.
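
The field layout and the bias can be illustrated by decoding a single-precision value in C. This is a sketch under the assumption that the host's float type uses the IEEE 32-bit format described above; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a single-precision value. */
struct sp_fields {
    unsigned sign;        /* S: 1 bit              */
    unsigned biased_exp;  /* EXP: 8 bits, biased   */
    uint32_t fraction;    /* FRACTION: 23 bits     */
};

/* Splits a float into its S, EXP and FRACTION fields. */
static struct sp_fields sp_decode(float f)
{
    uint32_t bits;
    struct sp_fields out;

    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);      /* reinterpret without aliasing issues */
    out.sign       = (unsigned)(bits >> 31);
    out.biased_exp = (unsigned)((bits >> 23) & 0xFFu);
    out.fraction   = bits & 0x7FFFFFu;
    return out;
}

/* True exponent of a normalized number: biased exponent minus 127.
   The reserved values 0 and 255 are rejected. */
static int sp_true_exponent(unsigned biased_exp)
{
    assert(biased_exp >= 1 && biased_exp <= 254);
    return (int)biased_exp - 127;
}
```

Decoding 1.0f, for example, yields a sign of 0, a biased exponent of 127 (true exponent 0), and a fraction of zero.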

2.4.9.2.1 Value Representation 

The PowerPC architecture defines numerical and non-numerical values representable
within single- and double-precision formats. The numerical values are approximations to 
the real numbers and include the normalized numbers, denormalized numbers, and zero 
values. The non-numerical values representable are the positive and negative infinities, and 
the NaNs. The positive and negative infinities are adjoined to the real numbers but are not 
numbers themselves, and the standard rules of arithmetic do not hold when they appear in 
an operation. They are related to the real numbers by "order" alone. It is possible however 
to define restricted operations among numbers and infinities as defined below. The relative 
location on the real number line for each of the defined entities is shown in Figure 2-43. 

Biased Exponent    Single-Precision    Double-Precision
(binary)           (unbiased)          (unbiased)
11...11            Reserved for Infinities and NaNs
11...10            +127                +1023               Positive
11...01            +126                +1022               Positive
  ...                ...                 ...
10...00            1                   1                   Positive
01...11            0                   0                   Zero
01...10            -1                  -1                  Negative
  ...                ...                 ...
00...01            -126                -1022               Negative
00...00            Reserved for Zeros and Denormalized Numbers

Figure 2-42. Biased Exponent Format



[Figure: real number line running from -INF through -NORM, -DENORM, zero, +DENORM,
and +NORM to +INF; the regions labeled "Tiny" near zero are unrepresentable, small numbers.]

Figure 2-43. Approximation to Real Numbers

The positive and negative NaNs are not related to the numbers or ±∞ by order or value, but
they are encodings that convey diagnostic information such as the representation of
uninitialized variables. Table 2-35 describes each of the floating-point formats.





Table 2-35. Recognized Floating-Point Numbers

Sign Bit    Exponent (Biased)         Leading Bit    Mantissa    Value
0           Maximum                   x              Non-zero    +NaN
0           Maximum                   x              Zero        +Infinity
0           0 < Exponent < Maximum    1              Non-zero    +Normalized
0           0                         0              Non-zero    +Denormalized
0           0                         0              Zero        +0
1           0                         0              Zero        -0
1           0                         0              Non-zero    -Denormalized
1           0 < Exponent < Maximum    1              Non-zero    -Normalized
1           Maximum                   x              Zero        -Infinity
1           Maximum                   x              Non-zero    -NaN
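
The rows of Table 2-35 amount to a test on the biased exponent and the fraction. The sketch below applies that test to a 64-bit double-precision bit pattern; the names dp_make, dp_classify, and the enumeration are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

enum dp_class {
    DP_ZERO,
    DP_DENORMALIZED,
    DP_NORMALIZED,
    DP_INFINITY,
    DP_NAN
};

/* Builds a double-precision bit pattern from its S, EXP and FRACTION fields. */
static uint64_t dp_make(unsigned sign, unsigned biased_exp, uint64_t fraction)
{
    assert(sign <= 1);
    assert(biased_exp <= 0x7FFu);
    assert(fraction < (UINT64_C(1) << 52));
    return ((uint64_t)sign << 63) | ((uint64_t)biased_exp << 52) | fraction;
}

/* Classifies a bit pattern by biased exponent and fraction; the sign
   bit does not affect the class. */
static enum dp_class dp_classify(uint64_t bits)
{
    unsigned biased_exp = (unsigned)((bits >> 52) & 0x7FFu);
    uint64_t fraction   = bits & ((UINT64_C(1) << 52) - 1);

    if (biased_exp == 0x7FFu)                 /* maximum exponent */
        return fraction ? DP_NAN : DP_INFINITY;
    if (biased_exp == 0)                      /* zero exponent    */
        return fraction ? DP_DENORMALIZED : DP_ZERO;
    return DP_NORMALIZED;
}
```

Here dp_make(0, 1023, 0) produces x'3FF0 0000 0000 0000', the encoding of +1.0, which classifies as normalized.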



The following sections describe floating-point values defined in the architecture: 

2.4.9.2.2 Binary Floating-Point Numbers 

Binary floating-point numbers are machine-representable values used to approximate real
numbers. Three categories of numbers are supported: normalized numbers, denormalized
numbers, and zero values.

2.4.9.2.3 Normalized Numbers (±NORM) 

The values for normalized numbers have a biased exponent value in the range: 

• 1-254 in single-precision format 

• 1-2046 in double-precision format 

The implied unit bit is one. Normalized numbers are interpreted as follows:

NORM = (-1)^{s} \times 2^{E} \times (1.fraction)

where (s) is the sign, (E) is the unbiased exponent and (1.fraction) is the significand
composed of a leading unit bit (implied bit) and a fractional part. The format for normalized
numbers is shown in Figure 2-44.



    [ S | MIN < EXPONENT < MAX (BIASED) | MANTISSA = ANY BIT PATTERN ]
      S = SIGN OF MANTISSA, 0 OR 1

Figure 2-44. Format for Normalized Numbers

The ranges covered by the magnitude (M) of a normalized floating-point number are
approximately equal to the following:

Single-precision format:

1.2 \times 10^{-38} \le M \le 3.4 \times 10^{38}

Double-precision format:

2.2 \times 10^{-308} \le M \le 1.8 \times 10^{308}

2.4.9.2.4 Zero Values (±0)

Zero values have a biased exponent value of zero and a fraction value of zero. This is shown 
in Figure 2-45. Zeros can have a positive or negative sign. The sign of zero is ignored by 
comparison operations (that is, comparison regards +0 as equal to -0). 



    [ S | EXPONENT = 0 (BIASED) | MANTISSA = 0 ]
      S = SIGN OF MANTISSA, 0 OR 1

Figure 2-45. Format for Zero Numbers

2.4.9.2.5 Denormalized Numbers (±DENORM) 

Denormalized numbers have a biased exponent value of zero and a non-zero fraction value. 
The format for denormalized numbers is shown in Figure 2-46. 



    [ S | EXPONENT = 0 (BIASED) | MANTISSA = ANY NON-ZERO BIT PATTERN ]
      S = SIGN OF MANTISSA, 0 OR 1

Figure 2-46. Format for Denormalized Numbers

Denormalized numbers are non-zero numbers smaller in magnitude than the representable 
normalized numbers. They are values in which the implied unit bit is zero. Denormalized 
numbers are interpreted as follows: 

DENORM = (-1)^{s} \times 2^{Emin} \times (0.fraction)

Emin is the minimum representable exponent value (-126 for single-precision, -1022 for 
double-precision). 

2.4.9.2.6 Infinities (±∞)

Positive and negative infinities have the maximum biased exponent value: 

• 255 in the single-precision format 

• 2047 in the double-precision format 

The format for infinities is shown in Figure 2-47. 



    [ S | EXPONENT = MAXIMUM (BIASED) | MANTISSA = 0 ]
      S = SIGN OF MANTISSA, 0 OR 1

Figure 2-47. Format for Positive and Negative Infinities

The fraction value is zero. Infinities are used to approximate values greater in magnitude 
than the maximum normalized value. Infinity arithmetic is defined as the limiting case of 
real arithmetic, with restricted operations defined between numbers and infinities. Infinities
and the reals can be related by ordering in the affine sense:

-∞ < every finite number < +∞

Arithmetic using infinite numbers is always exact and does not signal any exception, except
when an exception occurs due to the invalid operations as described in Section 5.4.7.2,
"Invalid Operation Exception Conditions."

2.4.9.2.7 Not a Numbers (NaNs) 

NaNs have the maximum biased exponent value and a non-zero fraction value. The format 
for NaNs is shown in Figure 2-48. The sign bit of NaNs is ignored (that is, NaNs are neither 
positive nor negative). If the high-order bit of the fraction field is a zero, the NaN is a 
signaling NaN; otherwise it is a quiet NaN (QNaN). 



[Figure 2-48 field labels: SIGN OF MANTISSA (0 for +NaN; 1 for -NaN); EXPONENT = MAXIMUM (BIASED); MANTISSA = ANY NON-ZERO BIT PATTERN]

Figure 2-48. Format for NaNs
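
The value classes described in the preceding sections differ only in the biased exponent and fraction fields, so they can be told apart with a few comparisons. The sketch below assumes the standard double-precision layout (1 sign bit, 11 exponent bits, 52 fraction bits); classify_double is a hypothetical helper, not a facility of the processor.

```python
def classify_double(bits):
    """Classify a 64-bit double-precision pattern by exponent and fraction."""
    exponent = (bits >> 52) & 0x7FF          # biased exponent field
    fraction = bits & ((1 << 52) - 1)        # fraction field
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0x7FF:                    # maximum biased exponent (2047)
        if fraction == 0:
            return "infinity"
        # High-order fraction bit: 0 means signaling NaN, 1 means quiet NaN.
        return "QNaN" if fraction >> 51 else "SNaN"
    return "normalized"

for pattern in (0x0, 0x1, 0x3FF0000000000000, 0x7FF0000000000000,
                0x7FF0000000000001, 0x7FF8000000000000):
    print("%016X %s" % (pattern, classify_double(pattern)))
```

The sign bit is deliberately ignored, since the class of a value does not depend on it.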

Signaling NaNs signal exceptions when they are specified as arithmetic operands. 

Quiet NaNs represent the results of certain invalid operations, such as invalid arithmetic 
operations on infinities or on NaNs, when the invalid operation exception is disabled 
(FPSCR[VE]=0). QNaNs are generated under the following conditions: 

• An invalid operation occurs and FPSCR[VE] = 0.

• An mffs instruction is executed and the upper 32 bits are undefined (only in the 
MPC601). 

• On Floating Convert to Integer with Round (fctir) and Floating Convert to Integer
with Round toward Zero (fctirz) the PowerPC architecture defines bits 0-31 of the
target floating point register as undefined. In the MPC601, these bits take on the
value x'FFF8 0000' (which is the representation for a QNaN).

Quiet NaNs propagate through all operations, except ordered comparison and conversion
to integer operations, without signalling exceptions. Specific encodings in QNaNs can thus
be preserved through a sequence of operations and used to convey diagnostic information 
to help identify results from invalid operations. 

When a QNaN results from an operation because an operand is a NaN or because a QNaN 
is generated due to a disabled invalid operation exception, the following rule is applied to 
determine the QNaN with the high-order fraction bit set to one that is to be stored as the 
result: 

If (frA) is a NaN
    Then frD <- (frA)
Else if (frB) is a NaN
    Then frD <- (frB)
Else if (frC) is a NaN
    Then frD <- (frC)
Else if generated QNaN
    Then frD <- generated QNaN

If the operand specified by frA is a NaN, that NaN is stored as the result. Otherwise, if the
operand specified by frB is a NaN (if the instruction specifies an frB operand), that NaN is 
stored as the result. Otherwise, if the operand specified by frC is a NaN (if the instruction 
specifies an frC operand), that NaN is stored as the result. Otherwise, if a QNaN is 
generated by a disabled invalid operation exception, that QNaN is stored as the result. If a 
QNaN is to be generated as a result, the QNaN generated has a sign bit of zero, an exponent 
field of all ones, and a high-order fraction bit of one with all other fraction bits zero. An 
instruction that generates a QNaN as the result of a disabled invalid operation generates this 
QNaN. This is shown in Figure 2-49. 






[Figure 2-49 field labels: SIGN OF MANTISSA; EXPONENT = 111...1; MANTISSA = 1000....0]

Figure 2-49. Representation of QNaN
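
The operand priority described above can be sketched as follows. This is an illustration only: operands are modeled as 64-bit integers, select_nan_result is a hypothetical name, and setting the high-order fraction bit of a propagated NaN is this sketch's reading of "the QNaN with the high-order fraction bit set to one", not a statement about the hardware datapath.

```python
GENERATED_QNAN = 0x7FF8000000000000   # sign 0, exponent all ones, fraction 1000...0
QUIET_BIT = 1 << 51                   # high-order fraction bit of a double

def is_nan(bits):
    """True if the pattern has the maximum exponent and a non-zero fraction."""
    return ((bits >> 52) & 0x7FF) == 0x7FF and (bits & ((1 << 52) - 1)) != 0

def select_nan_result(fra, frb=None, frc=None, invalid_disabled=False):
    """Return the QNaN to store in frD, or None if no NaN result applies."""
    for operand in (fra, frb, frc):           # priority order: frA, frB, frC
        if operand is not None and is_nan(operand):
            return operand | QUIET_BIT        # assumption: result is made quiet
    if invalid_disabled:                      # disabled invalid operation exception
        return GENERATED_QNAN
    return None

one = 0x3FF0000000000000
print("%016X" % select_nan_result(one, 0x7FF0000000000001))
```

Operands that an instruction does not specify are passed as None, which mirrors the "if the instruction specifies an frB operand" conditions in the text.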

2.4.9.3 Sign of Result 

The following rules govern the sign of the result of an arithmetic operation, when the 
operation does not yield an exception. These rules apply even when the operands or results 
are ±0 or ±∞:

• The sign of the result of an addition operation is the sign of the source operand 
having the larger absolute value. The sign of the result of the subtraction operation, 
x-y, is the same as the sign of the result of the addition operation, x+(-y). 

• When the sum of two operands with opposite sign, or the difference of two operands
with the same sign, is exactly zero, the sign of the result is positive in all rounding
modes except round toward negative infinity (-∞), in which case the sign is negative.

• The sign of the result of a multiplication or division operation is the exclusive OR 
of the signs of the source operands. 

• The sign of the result of a round to single-precision or convert to/from integer 
operation is the sign of the source operand. 

For multiply-add instructions, these rules are applied first to the multiplication operation 
and then to the addition or subtraction operation (one of the source operands to the addition 
or subtraction operation is the result of the multiplication operation). 
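
Two of these rules are easy to express directly. The sketch below uses hypothetical helper names and models sign bits as 0 or 1; the final checks rely on the host running IEEE 754 arithmetic in round-to-nearest mode, which is an assumption about the machine executing the example rather than about the MPC601.

```python
import math

def product_sign(sign_a, sign_b):
    """Sign of a multiplication or division result: exclusive OR of operand signs."""
    return sign_a ^ sign_b

def exact_zero_sum_sign(rounding_mode):
    """Sign of an exactly-zero sum of opposite-signed operands, by rounding mode."""
    return 1 if rounding_mode == "toward -infinity" else 0

def sign_bit(x):
    """Extract the sign bit of a host float, distinguishing +0 from -0."""
    return 1 if math.copysign(1.0, x) < 0 else 0

print(product_sign(0, 1), exact_zero_sum_sign("nearest"))
print(sign_bit(1.5 + -1.5))     # exact zero sum on the host
print(sign_bit(-0.0 * 3.0))     # product of a negative zero and a positive number
```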

2.4.9.4 Normalization and Denormalization

When an arithmetic operation produces an intermediate result, consisting of a sign bit, an 
exponent, and a non-zero significand with a zero leading bit, the result is not a normalized 
number and must be normalized before it is stored. 

A number is normalized by shifting its significand left while decrementing its exponent by 
one for each bit shifted, until the leading significand bit becomes one. The guard bit and the 
round bit participate in the shift with zeros shifted into the round bit; see Section 2.4.9.1,
"Execution Model for IEEE Operations." During normalization, the exponent is regarded 
as if its range were unlimited. If the resulting exponent value is less than the minimum 
value that can be represented in the format specified for the result, the intermediate result 
is said to be "tiny" and the stored result is determined by the rules described in Section 
5.4.7.5, "Underflow Exception Condition." The sign of the number does not change. 

When an arithmetic operation produces a non-zero intermediate result whose exponent is 
less than the minimum value that can be represented in the format specified, the stored 
result may need to be denormalized. The result is determined by the rules described in 
Section 5.4.7.5, "Underflow Exception Condition." 

A number is denormalized by shifting its significand to the right while incrementing its 
exponent by one for each bit shifted until the exponent equals the format's minimum value. 
If any significant bits are lost in this shifting process then "Loss of Accuracy" has occurred 
and an underflow exception is signaled. The sign of the number does not change. 

When denormalized numbers are operands of multiply and divide operations, operands are 
prenormalized internally before performing the operations. 
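
The two shifting procedures can be sketched on an integer significand. This is a simplified model under stated assumptions: the significand is a plain integer of a caller-chosen width, the guard and round bits are not modeled separately (zeros are shifted in from the right), and the function names are hypothetical.

```python
def normalize(significand, exponent, width):
    """Shift left until the leading bit (bit width-1) is one, decrementing the exponent."""
    if significand == 0:
        raise ValueError("a zero significand cannot be normalized")
    while not (significand >> (width - 1)) & 1:
        significand <<= 1
        exponent -= 1          # exponent range is treated as unlimited here
    return significand, exponent

def denormalize(significand, exponent, e_min):
    """Shift right until the exponent reaches e_min; report whether bits were lost."""
    lost = False
    while exponent < e_min:
        lost = lost or bool(significand & 1)   # a one shifted out is a loss of accuracy
        significand >>= 1
        exponent += 1
    return significand, exponent, lost

print(normalize(0b0011, 5, 4))            # two left shifts
print(denormalize(0b1000, -128, -126))    # two right shifts, nothing lost
```

In this model the lost flag corresponds to the "Loss of Accuracy" condition mentioned above; how the processor then rounds and reports the result is governed by the underflow rules referenced in the text.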

2.4.9.5 Data Handling and Precision 

There are specific instructions for moving floating-point data between the FPRs and 
memory. For double-precision format data, the data is not altered during the move. For 
single-precision data, the format is converted to double-precision format when data is 
loaded from memory into an FPR. A format conversion from double- to single-precision is 
performed when data from an FPR is stored. Floating-point exceptions cannot occur during 
these operations. 

All arithmetic operations use floating-point double-precision format. 

Floating-point single-precision formats are used by the following four types of instructions: 

• Load Floating-Point Single-Precision (lfs) — This instruction accesses a single-
precision operand in single-precision format in memory, converts it to double-
precision, and loads it into an FPR. Exceptions are not detected during the load 
operation. 

• Round to floating-point single-precision — If the operand is not already in single- 
precision range, the floating round to single-precision instruction rounds a double- 
precision operand to single-precision, checking the exponent for single-precision 
range and handling any exceptions according to respective enable bits in the FPSCR.
The instruction places that operand into an FPR as a double-precision operand. For 
results produced by single-precision arithmetic instructions and by single-precision 
loads, this operation does not alter the value. 

• Single-precision arithmetic instructions — These instructions take operands from the
FPRs in double-precision format, perform the operation as if it produced an
intermediate result correct to infinite precision and with unbounded range, and then
force this intermediate result to fit in single-precision format. Status bits in the
FPSCR and in the condition register are set to reflect the single-precision result. The
result is then converted to double-precision format and placed into an FPR. The
result falls within the range supported by the single format.

  For single-precision operations, source operands must be representable in single-
  precision format. If they are not, the result placed into the target FPR, and the setting
  of status bits in the FPSCR and in the condition register, are undefined.

• Store Floating-Point Single-Precision (stfs) — This form of instruction converts a
double-precision operand to single-precision format and stores that operand into 
memory. If the operand requires denormalization in order to fit in single-precision 
format, it is automatically denormalized prior to being stored. No exceptions are 
detected on the store operation (the value being stored is effectively assumed to be 
the result of an instruction of one of the preceding three types). 

When the result of a Load Floating-Point Single-Precision (lfs), Floating-Point Round to
Single-Precision (frspx), or single-precision arithmetic instruction is stored in an FPR, the
low-order 29 fraction bits are zero. This is shown in Figure 2-50.



                                   Bit 35
                                   |
 S | EXP | xxxxxxxxxxxxxxxxxxxxxxx 00000000000000000000000000000
 0   1 11  12                                                63

Figure 2-50. Single-Precision Representation in an FPR
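
The zero low-order bits follow from the difference in fraction widths (52 bits in double precision against 23 in single precision). The sketch below models the widening performed by a single-precision load using the host's own conversion; load_single_as_double is a hypothetical name, and the example is restricted to finite values because NaN handling in the host conversion is not guaranteed to match the processor.

```python
import struct

LOW_29_MASK = (1 << 29) - 1   # the 52 - 23 = 29 low-order fraction bits

def load_single_as_double(single_bits):
    """Widen a 32-bit single-precision pattern to its 64-bit double-precision image."""
    value = struct.unpack(">f", struct.pack(">I", single_bits))[0]
    return struct.unpack(">Q", struct.pack(">d", value))[0]

image = load_single_as_double(0x3F800001)      # 1.0 plus one single-precision ulp
print("%016X" % image)
print((image & LOW_29_MASK) == 0)
```

The single-precision fraction bit that was set in the input appears at bit position 29 (counting from the least significant end) of the double-precision image, directly above the 29 zero bits.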

The Floating-Point Round To Single-Precision (frspx) instruction allows conversion from
double- to single-precision with appropriate exception checking and rounding. This
instruction should be used to convert double-precision floating-point values (produced by
double-precision load and arithmetic instructions) to single-precision values before storing
them into single-format memory elements or using them as operands for single-precision
arithmetic instructions. Values produced by single-precision load and arithmetic
instructions can be stored directly, or used directly as operands for single-precision
arithmetic instructions, without preceding the store, or the arithmetic instruction, by frspx.

A single-precision value can be used in double-precision arithmetic operations. The reverse 
is true only if the double-precision value can be represented in single-precision format. 
Some implementations may execute single-precision arithmetic instructions faster than 
double-precision arithmetic instructions. Therefore, if double-precision accuracy is not
required, using single-precision data and instructions can speed operations. 

2.4.9.6 Rounding 

All arithmetic instructions defined by the PowerPC architecture produce an intermediate 
result considered infinitely precise. This result must then be written with a precision of 
finite length into an FPR. After normalization or denormalization, if the infinitely precise 
intermediate result cannot be represented in the precision required by the instruction, it is 
rounded before being placed into the target FPR. 

The instructions that potentially round their result are the arithmetic, multiply-add, and 
rounding and conversion instructions. As shown in Figure 2-51, whether rounding occurs 
depends on the source values. 



[Figure 2-51 flow: Rounding? No: FI = 0, FR = 0. Yes: FI = 1, then Fraction incremented? No: FR = 0. Yes: FR = 1.]

Figure 2-51. Rounding Flow Diagram

Each of these instructions sets FPSCR bits FR and FI, according to whether rounding
occurs (FI) and whether the fraction was incremented (FR). If rounding occurs, FI is set to
one and FR may be either zero or one. If rounding does not occur, both FR and FI are
cleared. Other floating-point instructions do not alter FR and FI. Four modes of rounding
are provided that are user-selectable through the floating-point rounding control field in the
FPSCR. See Section 2.2.3, "Floating-Point Status and Control Register (FPSCR)." These
are encoded as follows in Table 2-36.

Table 2-36. FPSCR Bit Settings— RN Field

RN    Rounding Mode
00    Round to nearest
01    Round toward zero
10    Round toward +infinity
11    Round toward -infinity

Let Z be the infinitely precise intermediate arithmetic result or the operand of a conversion
operation. If Z can be represented exactly in the target format, no rounding occurs and the
result in all rounding modes is equivalent to truncation of Z. If Z cannot be represented
exactly in the target format, let Z1 and Z2 be the next larger and next smaller numbers
representable in the target format that bound Z; then Z1 or Z2 can be used to approximate
the result in the target format.

Figure 2-52 shows a graphical representation of Z, Z1, and Z2 in this case and Figure 2-53
shows the selection of Z1 and Z2 for the four rounding settings.



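[Figure 2-52: number line showing, for negative values and for positive values, Z2 and Z1 on either side of the infinitely precise value Z; one neighbor is obtained by truncating after the LSB and the other by incrementing the LSB of Z.]

Figure 2-52. Relation of Z1 and Z2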

Rounding follows these four rules:

• Round to nearest — Choose the best approximation (Z1 or Z2). In case of a tie, choose
the one that is even (least significant bit 0).

• Round toward zero — Choose the smaller in magnitude (Z1 or Z2).

• Round toward +infinity — Choose Z1.

• Round toward -infinity — Choose Z2. 

See Section 2.4.9.1, "Execution Model for IEEE Operations," for a detailed explanation of 
rounding. If Z is to be rounded up and Z1 does not exist (that is, if there is no number larger
than Z that is representable in the target format), then an overflow exception occurs if Z is 
positive and an underflow exception occurs if Z is negative. Similarly, if Z is to be rounded 
down and Z2 does not exist, then an overflow exception occurs if Z is negative and an 
underflow exception occurs if Z is positive. The results in these cases are defined in Section 
5.4.7.1, "Floating-Point Enabled Program Exceptions." 
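
The selection between Z1 and Z2, together with the FI and FR settings described earlier in this section, can be sketched with exact rational arithmetic. This is a simplified model under stated assumptions: the target format is represented as the set of integer multiples of a fixed quantum (so exponent changes, overflow, and underflow are not modeled), the mode strings are hypothetical labels for the four RN settings, and round_to_format is an illustrative name.

```python
from fractions import Fraction
import math

def round_to_format(z, quantum, mode):
    """Round exact value z to a multiple of quantum. Returns (result, FI, FR)."""
    scaled = Fraction(z) / Fraction(quantum)
    z2 = math.floor(scaled)     # next smaller representable value, in quantum units
    z1 = math.ceil(scaled)      # next larger representable value, in quantum units
    if z1 == z2:
        return Fraction(z), 0, 0            # exact: no rounding occurs
    if mode == "nearest":
        above, below = z1 - scaled, scaled - z2
        if above < below:
            chosen = z1
        elif below < above:
            chosen = z2
        else:
            chosen = z1 if z1 % 2 == 0 else z2   # tie: least significant bit 0
    elif mode == "zero":
        chosen = z1 if abs(z1) < abs(z2) else z2
    elif mode == "+infinity":
        chosen = z1
    elif mode == "-infinity":
        chosen = z2
    else:
        raise ValueError("unknown rounding mode")
    fr = 1 if abs(chosen) > abs(scaled) else 0   # magnitude incremented?
    return chosen * Fraction(quantum), 1, fr

for mode in ("nearest", "zero", "+infinity", "-infinity"):
    print(mode, round_to_format(Fraction(-5, 2), 1, mode))
```

Using Fraction keeps Z infinitely precise until the final selection, which is the property the rounding rules depend on.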

[Figure 2-53 flowchart: Z is the infinitely precise result or operand. Decision boxes: "Does Z fit target format?" (Yes: rounding = truncation), "Round toward -∞?", "Round toward +∞?", "Round toward 0?", "Round to nearest". Outcome boxes: "Choose Z2", "Choose Z1", "Choose best approximation (Z1 or Z2)", and, if tie, "Choose even value (Z1 or Z2 whose lsb is 0)".]

Figure 2-53. Selection of Z1 and Z2

2.5 Unimplemented PowerPC Registers 

The following PowerPC registers are not implemented in the MPC601:

• The time base SPRs are used in the PowerPC architecture instead of the RTC 
registers. The architected time base facility operates as a subdivision of the 
frequency provided by the processor clock. 

• Floating-point exception cause register (FPECR) — This is a supervisor-level SPR
(1023) that is used by some implementations to determine the cause of a floating- 
point error. 

• Address space register (ASR) — The ASR is a 64-bit SPR used in 64-bit
implementations to perform address translations. 

• Each PowerPC processor implements a unique set of HID registers. Note that some 
of these registers may be implemented the same way in more than one PowerPC 
processor design. 

An mtspr or mfspr instruction that specifies an unimplemented register is treated as a no- 
op. If a privilege violation is indicated, the program exception has priority over the no-op. 
This can occur if a user-mode program tries to access a register with bit 0 of the SPR
encoding field (in the instruction format) set. However, in this case the program exception
is taken regardless of whether the SPR encoding specified an implemented register. 



2.6 Reset 

The following sections describe hard reset and soft reset in the MPC601 processor. For
more information about the reset exception see Section 5.4.1, "Reset Exceptions
(x'00100')."

2.6.1 Hard Reset 

The hard reset sequence begins when the hard reset signal HRESET is negated after being
driven as described in Section 8.2.9.4.1, "Hard Reset (HRESET)— Input." Note that a hard
reset operation is required on power-on in order to properly reset the MPC601. 

Table 2-37 shows the state of the registers after a hard reset and before it fetches the first 
instruction from address x'FFF0 0100', the system reset exception vector.



Table 2-37. Settings after Hard Reset (Used at Power-On)

Register            Setting      Register        Setting
GPRs                All 0s       SRR1            00000000
FPRs                All 0s       SPRG0           00000000
FPSCR               00000000     SPRG1           00000000
Condition register  All 0s       SPRG2           00000000
Segment registers   All 0s       SPRG3           00000000
MSR                 00001040     EAR             00000000
MQ                  00000000     PVR             00010001¹
XER                 00000000     BAT registers   All 0s
RTCU³               00000000     HID0            80010080²
RTCL³               00000000     HID1            00000000
Link register       00000000     HID2            00000000
CTR                 00000000     HID5            00000000
DSISR               00000000     HID15           00000000
DAR                 00000000     TLBs            All 0s
DEC³                00000000     Cache           All 0s
SDR1                00000000     Tag directory   All 0s. (However, the LRU bits are
                                                 initialized such that each side of the
                                                 cache has a unique LRU value.)
SRR0                00000000

Notes: ¹ In the earliest release of the MPC601 (DD1), this is 00010000. Later versions of the hardware may be
         different.
       ² Master checkstop enable on, sequencer GPR self-test checkstop, invalid microcode instruction checkstop on.
       ³ Note that if an external clock is connected to RTC for the MPC601, then the RTCL, RTCU, and DEC can change
         from their initial value of 0s without receiving instructions to load those registers.

The following is also true after a hard reset operation: 

• External checkstops are enabled. 

• The on-chip COP has given control of the PIs/POs to the rest of the chip for 
functional use. 

• Since the reset exception has data and instruction translation disabled (MSR[DT] 
and MSR[IT] both cleared), the chip operates in direct address translation mode. 
This implies that instruction fetches as well as loads and stores are cacheable. 
(Operations that correspond to direct address translations are implicitly cacheable, 
not write-through mode, and require coherency checking on the bus). 

• All internal arrays and registers are cleared during the hard reset process. 

2.6.2 Soft Reset 

Registers are not re-initialized when a soft reset occurs (SRESET is asserted as described
in Section 8.2.9.4.2, "Soft Reset (SRESET)— Input"). The SRR0 and SRR1 registers are
updated with instruction and MSR data, and the MSR values are reset according to
procedures described in Section 5.4.1, "Reset Exceptions (x'00100')."

Chapter 3

Addressing Modes and Instruction Set Summary

This chapter describes instructions and address modes supported by the MPC601 
microprocessor. These instructions are divided into the following categories: 

• Integer instructions — These include computational and logical instructions.

• Floating-point instructions — These include floating-point computational
instructions, as well as instructions that affect the floating-point status and control
register.

• Load/store instructions — These include integer and floating-point load and store
instructions.

• Flow control instructions — These include branching instructions, condition register
logical instructions, trap instructions, and other instructions that affect the
instruction flow.

• Processor control instructions — These instructions are used for synchronizing
memory accesses and management of caches, TLBs, and the segment registers.

Note that this grouping of the instructions does not indicate which execution unit executes 
a particular instruction or group of instructions. This information, which is useful in taking 
full advantage of the MPC601's superscalar parallel instruction execution, is provided with
each instruction in Chapter 10, "Instruction Set." 

Integer instructions operate on byte, half-word, and word operands. Floating-point 
instructions operate on single-precision and double-precision floating-point operands. The 
PowerPC architecture uses instructions that are four bytes long and word-aligned. It 
provides for byte, half-word, and word operand fetches and stores between memory and a 
set of 32 general-purpose registers (GPRs). It also provides for word and double-word 
operand fetches and stores between memory and a set of 32 floating-point registers (FPRs). 

Computational instructions do not modify memory. To use a memory operand in a 
computation and then modify the same or another memory location, the memory contents 
must be loaded into a register, modified, and then written back to the target location. 

The MPC601 executes program instructions when it is in the normal execution state. 
However, the flow of instructions can be interrupted directly by the execution of an 
instruction or by an asynchronous event. Either kind of exception may cause one of several
components of the system software to be invoked. 

3.1 Memory Addressing 

A program references memory using the effective address computed by the processor when 
it executes a memory access or branch instruction, or when it fetches the next sequential 
instruction. 

3.1.1 Effective Address Calculation

The effective address is the 32-bit address computed by the processor when executing a 
memory access or branch instruction or when fetching the next sequential instruction. For 
a memory access instruction, if the sum of the effective address and the operand length 
exceeds the maximum effective address, the storage operand is considered to wrap around 
from the maximum effective address to effective address 0, as described in the following 
paragraphs. 

Effective address computations for both data and instruction accesses use 32-bit unsigned 
binary arithmetic. A carry from bit 0 is ignored.

Load and store operations have three categories of effective address generation: 

• Register indirect with immediate index mode. The d operand is added to the contents 
of the GPR specified by the rA operand to generate the effective address. 

• Register indirect with index mode. The contents of the GPR specified by rB operand 
are added to the contents of the GPR specified by the rA operand to generate the 
effective address.

• Register indirect mode. The contents of the GPR specified by the rA operand are 
used as the effective address. 
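
The three load and store address modes can be sketched as follows. This is an illustration under stated assumptions: the GPRs are modeled as a list of 32 unsigned 32-bit integers, the d operand is treated as a 16-bit signed immediate, and the function names are hypothetical. Masking to 32 bits models both the ignored carry and the wrap from the maximum effective address to address 0.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def ea_immediate_index(gpr, ra, d):
    """Register indirect with immediate index: (rA) + d."""
    return (gpr[ra] + sign_extend(d, 16)) & MASK32

def ea_index(gpr, ra, rb):
    """Register indirect with index: (rA) + (rB)."""
    return (gpr[ra] + gpr[rb]) & MASK32

def ea_register_indirect(gpr, ra):
    """Register indirect: (rA)."""
    return gpr[ra] & MASK32

gpr = [0] * 32
gpr[3] = 0xFFFFFFFC
gpr[4] = 0x00001000
print(hex(ea_immediate_index(gpr, 3, 8)))       # wraps past the maximum address
print(hex(ea_immediate_index(gpr, 4, 0xFFFC)))  # negative displacement of 4
print(hex(ea_index(gpr, 3, 4)))
```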

Branch instructions have three categories of effective address generation: 

• Immediate addressing. The BD or LI operands are sign extended with the two low- 
order bits cleared to zero to generate the branch effective address. 

• Link register indirect. The contents of the link register with the two low-order bits 
cleared to zero are used as the branch effective address. 

• Counter register indirect. The contents of the counter register with the two low-order 
bits cleared to zero are used as the branch effective address. 

Branch instructions can optionally load the link register with the next sequential instruction 
address (current instruction address + 4). 
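
The branch address forms above can be sketched in the same style. The field widths used here (24 bits for LI, 14 bits for BD) are taken from the PowerPC instruction formats rather than from this section, and the function names are hypothetical. The sketch only forms the sign-extended operand; how that operand is combined with the current instruction address is not covered here.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_immediate(field, field_bits):
    """Append two zero low-order bits to LI or BD, then sign extend to 32 bits."""
    return sign_extend(field << 2, field_bits + 2) & MASK32

def branch_via_register(register_value):
    """Link or counter register contents with the two low-order bits cleared."""
    return register_value & 0xFFFFFFFC

def link_value(current_instruction_address):
    """Next sequential instruction address, optionally loaded into the link register."""
    return (current_instruction_address + 4) & MASK32

print(hex(branch_immediate(0xFFFFFF, 24)))   # LI of all ones
print(hex(branch_via_register(0x1003)))
print(hex(link_value(0x2000)))
```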

3.1.2 Context Synchronization 

The System Call (sc), Return from Interrupt (rfi), and Move to Machine State Register 
(mtmsr) instructions perform context synchronization by allowing previously issued 
instructions to complete before performing a context switch. Execution of one of these
instructions ensures the following: 

• No higher priority exception exists. 

• All previous instructions have completed to a point where they can no longer cause 
an exception. If a prior memory access instruction causes direct-store error 
exceptions, the results must be determined before this instruction is executed. 

• Previous instructions complete execution in the context (privilege, protection, and 
address translation) under which they were issued. 

• The instructions following the sc, rfi, and mtmsr instructions execute in the context
established by these instructions.

3.2 Exception Summary 

There are two kinds of exceptions in the MPC601 — those caused directly by the execution 
of an instruction and those caused by an asynchronous event. Either kind of exception 
causes one of several components of the system software to be invoked. 

Exceptions can be caused directly by the execution of an instruction in the following 
situations: 

• An attempt to execute an illegal instruction or an attempt by an application program 
to execute a supervisor-level instruction causes the illegal instruction or supervisor- 
level instruction handler to be invoked. 

• An attempt to access memory in a manner that violates memory protection causes 
the data access exception handler or instruction access exception handler to be 
invoked. 

• An attempt to access memory with an effective address alignment that is invalid for 
the instruction causes the alignment exception handler to be invoked. 

• The execution of an sc instruction causes the system service program to be invoked.

• The execution of a trap instruction that traps causes the program exception trap 
handler to be invoked. 

• The execution of a floating-point instruction when floating-point instructions are 
unavailable causes the floating-point unavailable handler to be invoked. 

• The execution of an instruction that causes a floating-point exception that is enabled 
causes the floating-point enabled exception handler to be invoked. 

Exceptions caused by asynchronous events are described in Chapter 5, "Exceptions". 

3.3 Integer Instructions

This section describes the integer instructions. These consist of the following: 

• Integer arithmetic instructions 

• Integer compare instructions 

• Integer rotate and shift instructions 

• Integer logical instructions. 

Integer instructions use the content of the GPRs as source operands and place results into 
GPRs, into the integer exception register (XER), and into condition register fields. Trap
instructions compare the contents of one GPR with a second GPR or with immediate data 
and, if the conditions are met, invoke the program exception trap handler. 

These instructions treat the source operands as signed integers unless the instruction is 
explicitly identified as an unsigned operation or an address conversion. 

The integer instructions that are coded to update the condition register and the integer 
logical and arithmetic instructions (addic., andi., and andis.) set condition register field
CRO (bits 0-3) to characterize the result of the operation. The condition register field CRO 
is set as if result were compared algebraically to zero. 

The integer arithmetic instructions (addic, addic., subfic, addc, subfc, adde, subfe,
addme, subfme, addze, and subfze) always set integer exception register bit CA to reflect 
the carry out of bit 0. Integer arithmetic instructions with the overflow enable (OE) bit set 
will cause the XER bits SO and OV to be set to reflect overflow of the 32-bit result. 

Unless otherwise noted, when condition register field CRO and the XER are affected they 
reflect the value placed in the target register. 
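
The status settings described above can be sketched for a 32-bit addition. This is a minimal model, not the processor's implementation: registers are plain unsigned integers, CR0 is returned as a dictionary whose LT, GT, EQ, and SO names follow the PowerPC condition register field definition, and the function names are hypothetical. The subf helper uses the ¬(rA) + (rB) + 1 form of subtraction.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def add_with_flags(a, b, carry_in=0, so=0):
    """Return (result, CA, OV, CR0) for a + b + carry_in on 32-bit operands."""
    total = a + b + carry_in
    result = total & MASK32
    ca = total >> 32                                   # carry out of bit 0
    exact = to_signed(a) + to_signed(b) + carry_in
    ov = 1 if exact != to_signed(result) else 0        # signed overflow
    so = so | ov                                       # summary overflow is sticky
    value = to_signed(result)                          # compared algebraically to zero
    cr0 = {"LT": int(value < 0), "GT": int(value > 0),
           "EQ": int(value == 0), "SO": so}
    return result, ca, ov, cr0

def subf(ra, rb):
    """Subtract from: (rB) - (rA), computed as the sum of ~(rA), (rB), and 1."""
    return add_with_flags(~ra & MASK32, rb, 1)

print(add_with_flags(0x7FFFFFFF, 1))
print(subf(3, 10))
```

Whether a given instruction actually updates CR0, CA, or OV depends on its form (record bit, carrying variant, OE bit); the sketch computes all of them so the relationships are visible in one place.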

The MPC601 performs best for aligned load and store operations. See Section 5.4.6, 
"Alignment Exception (x'00600')," for scenarios that cause an alignment exception. 

3.3.1 Integer Arithmetic Instructions 

In the MPC601, instructions that select the overflow option (enable XER(OV)) or that set
the integer exception register carry bit (CA) may delay the execution of subsequent
instructions. 

The MPC601 integer unit defines one additional register to the user register set and 
programming model that is not present in other PowerPC implementations. The MQ 
register is a 32-bit register whose primary use is to provide a register extension to 
accommodate the product for the MPC601-specific Multiply (mul) instruction and the
dividend for the MPC601-specific Divide (div) instruction. It is also used as an operand of
long rotate and shift instructions. 

The MQ register is never architecturally modified by any of the instructions defined in the 
PowerPC architecture. However, in the MPC601 the MQ register may be modified during
the execution of any POWER or PowerPC multiply or divide instruction. The value written
to the MQ register during these operations is operand dependent. 

Table 3-1 lists the integer arithmetic instructions for the MPC601. Note that some of the 
instructions are specific to the MPC601 implementation. 

Table 3-1 . Integer Arithmetic Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Add 
Immediate 


addi 


rD,rA,SlMM 


The sum (rA|0) + SIMM is placed into register rD. 


Add 

Immediate 

Shifted 


addis 


rD.rA.SIMM 


The sum (rA|0) + (SIMM || x '0000') is placed into register rD. 


Add 


add 
add. 
addo 
addo. 


rD,rA,rB 


The sum (rA) + (rB) is placed into register rD. 

add Add 

add. Add with CR Update. The dot suffix enables the update of 

the condition register. 
addo Add with Overflow Enabled. The o suffix enables the 

overflow bit (OV) in the XER. 
addo. Add with Overflow and CR Update. The o. suffix enables 

the update of the condition register and enables the 

overflow bit (OV) in the XER. 


Subtract 
from 


subf 
subf. 
subfo 
subfo. 


rD,rA,rB 


The sum ->(rA) + (rB) +1 is placed into rD. 

subf Subtract from 

subf. Subtract from with CR Update. The dot suffix enables the 

update of the condition register. 
subfo Subtract from with Overflow Enabled. The o suffix enables 

the overflow. The o suffix enables the overflow bit (OV) in 

the XER. 
subfo. Subtract from with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 


Add 

Immediate 

Carrying 


addic 


rD,rA,SIMM 


The sum (rA) + SIMM is placed into register rD. 


Add 

Immediate 
Carrying 
and Record 


addic. 


rD.rA.SIMM 


The sum (rA) + SIMM is placed into rD. The condition register is 
updated. 


Subtract 
from 

Immediate 
Carrying 


subfic 


rD.rA.SIMM 


The sum -i(rA) + SIMM + 1 is placed into register rD. 
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Table 3-1 . Integer Arithmetic Instructions (Continued) 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Add 
Carrying 


addc 
addc. 
addco 
addco. 


rD,rA,rB 


The sum (rA) + (rB) is placed into register rD. 

addc Add Carrying 

addc. Add Carrying with CR Update. The dot suffix enables the 

update of the condition register. 
addco Add Carrying with Overflow Enabled. The o suffix enables 

the overflow bit (OV) in the XER. 
addco. Add Carrying with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 


Subtract 

from 

Carrying 


subfc 
subfc. 
subfco 
subfco. 


rD,rA,rB 


The sum ¬(rA) + (rB) + 1 is placed into register rD. 

subfc Subtract from Carrying 

subfc. Subtract from Carrying with CR Update. The dot suffix 

enables the update of the condition register. 
subfco Subtract from Carrying with Overflow. The o suffix enables 

the overflow bit (OV) in the XER. 
subfco. Subtract from Carrying with Overflow and CR Update. 

The o. suffix enables the update of the condition register 

and enables the overflow bit (OV) in the XER. 


Add 
Extended 


adde 
adde. 
addeo 
addeo. 


rD,rA,rB 


The sum (rA) + (rB) + XER(CA) is placed into register rD. 

adde Add Extended 

adde. Add Extended with CR Update. The dot suffix enables the 

update of the condition register. 
addeo Add Extended with Overflow. The o suffix enables the 

overflow bit (OV) in the XER. 
addeo. Add Extended with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 


Subtract 

from 

Extended 


subfe 
subfe. 
subfeo 
subfeo. 


rD,rA,rB 


The sum ¬(rA) + (rB) + XER(CA) is placed into register rD. 

subfe Subtract from Extended 

subfe. Subtract from Extended with CR Update. The dot suffix 

enables the update of the condition register. 
subfeo Subtract from Extended with Overflow. The o suffix 

enables the overflow bit (OV) in the XER. 
subfeo. Subtract from Extended with Overflow and CR Update. 

The o. suffix enables the update of the condition register 

and enables the overflow (OV) bit in the XER. 


Add to 
Minus One 
Extended 


addme 
addme. 
addmeo 
addmeo. 


rD,rA 


The sum (rA) + XER(CA) + x'FFFFFFFF' is placed into register rD. 

addme Add to Minus One Extended 

addme. Add to Minus One Extended with CR Update. The dot 

suffix enables the update of the condition register. 
addmeo Add to Minus One Extended with Overflow. The o suffix 

enables the overflow bit (OV) in the XER. 
addmeo. Add to Minus One Extended with Overflow and CR 

Update. The o. suffix enables the update of the condition 

register and enables the overflow (OV) bit in the XER. 


Subtract 
from Minus 
One 
Extended 


subfme 
subfme. 
subfmeo 
subfmeo. 


rD,rA 


The sum ¬(rA) + XER(CA) + x'FFFFFFFF' is placed into register rD. 

subfme Subtract from Minus One Extended 

subfme. Subtract from Minus One Extended with CR Update. The 

dot suffix enables the update of the condition register. 
subfmeo Subtract from Minus One Extended with Overflow. The o 

suffix enables the overflow bit (OV) in the XER. 
subfmeo. Subtract from Minus One Extended with Overflow and CR 

Update. The o. suffix enables the update of the condition 

register and enables the overflow bit (OV) in the XER. 


Add to Zero 
Extended 


addze 
addze. 
addzeo 
addzeo. 


rD,rA 


The sum (rA) + XER(CA) is placed into register rD. 

addze Add to Zero Extended 

addze. Add to Zero Extended with CR Update. The dot suffix 

enables the update of the condition register. 
addzeo Add to Zero Extended with Overflow. The o suffix enables 

the overflow bit (OV) in the XER. 
addzeo. Add to Zero Extended with Overflow and CR Update. The 

o. suffix enables the update of the condition register and 

enables the overflow bit (OV) in the XER. 


Subtract 
from Zero 
Extended 


subfze 
subfze. 
subfzeo 
subfzeo. 


rD,rA 


The sum ¬(rA) + XER(CA) is placed into register rD. 

subfze Subtract from Zero Extended 

subfze. Subtract from Zero Extended with CR Update. The dot 

suffix enables the update of the condition register. 
subfzeo Subtract from Zero Extended with Overflow. The o suffix 

enables the overflow bit (OV) in the XER. 
subfzeo. Subtract from Zero Extended with Overflow and CR 

Update. The o. suffix enables the update of the condition 

register and enables the overflow bit (OV) in the XER. 


Negate 


neg 
neg. 
nego 
nego. 


rD,rA 


The sum ¬(rA) + 1 is placed into register rD. 

neg Negate 

neg. Negate with CR Update. The dot suffix enables the update 

of the condition register. 
nego Negate with Overflow. The o suffix enables the overflow bit 

(OV) in the XER. 
nego. Negate with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 


Multiply 

Low 

Immediate 


mulli 


rD,rA,SIMM 


The low-order 32 bits of the 48-bit product (rA)*SIMM are placed into 
register rD. The low-order 32 bits of the product are the correct 32-bit 
product. The low-order bits are independent of whether the operands 
are treated as signed or unsigned integers. However, XER[OV] is set 
based on the result interpreted as a signed integer. 
The high-order bits are lost. This instruction can be used with 
mulhwx to calculate a full 64-bit product. 


Multiply 
Low 


mullw 
mullw. 
mullwo 
mullwo. 


rD,rA,rB 


The low-order 32 bits of the 64-bit product (rA)*(rB) are placed into 
register rD. The low-order 32 bits of the product are the correct 32-bit 
product. The low-order bits are independent of whether the operands 
are treated as signed or unsigned integers. However, XER[OV] is set 
based on the result interpreted as a signed integer. 

The high-order bits are lost. This instruction can be used with 
mulhwx to calculate a full 64-bit product. Some implementations may 
execute faster if rB contains the operand having the smaller absolute 
value. 

mullw Multiply Low 

mullw. Multiply Low with CR Update. The dot suffix enables the 

update of the condition register. 
mullwo Multiply Low with Overflow. The o suffix enables the 

overflow bit (OV) in the XER. 
mullwo. Multiply Low with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 


Multiply 
High Word 


mulhw 
mulhw. 


rD,rA,rB 


The contents of rA and rB are interpreted as 32-bit signed integers. 
The 64-bit product is formed. The high-order 32 bits of the 64-bit 
product are placed into rD. 

Both operands and the product are interpreted as signed integers. 

This instruction may execute faster if rB contains the operand having 
the smaller absolute value. 

mulhw Multiply High Word 

mulhw. Multiply High Word with CR Update. The dot suffix enables 
the update of the condition register. 


Multiply 
High Word 
Unsigned 


mulhwu 
mulhwu. 


rD,rA,rB 


The contents of rA and of rB are extracted and interpreted as 32-bit 
unsigned integers. The 64-bit product is formed. The high-order 32 
bits of the 64-bit product are placed into rD. 

Both operands and the product are interpreted as unsigned integers. 

This instruction may execute faster if rB contains the operand having 
the smaller absolute value. 

mulhwu Multiply High Word Unsigned 

mulhwu. Multiply High Word Unsigned with CR Update. The dot 
suffix enables the update of the condition register. 


Divide Word 


divw 
divw. 
divwo 
divwo. 


rD,rA,rB 


The dividend is the signed value of (rA). The divisor is the signed 
value of (rB). The 64-bit quotient is formed. The low-order 32 bits of 
the 64-bit quotient are placed into rD. The remainder is not supplied 
as a result. 

Both operands are interpreted as signed integers. The quotient is the 
unique signed integer that satisfies the following: 

dividend = (quotient times divisor) + r 

where 0 ≤ r < |divisor| if the dividend is non-negative, and -|divisor| < 
r ≤ 0 if the dividend is negative. 

If an attempt is made to perform any of the divisions 

x'8000_0000' / -1 
or 
<anything> / 0 

the contents of register rD are undefined, as are the contents of the 
LT, GT, and EQ bits of the condition register field CR0 if the 
instruction has condition register updating enabled. In these cases, if 
instruction overflow is enabled, then XER[OV] is set. 

The 32-bit signed remainder of dividing (rA) by (rB) can be computed 
as follows, except in the case that (rA) = -2^31 and (rB) = -1: 

divw rD,rA,rB # rD = quotient 

mullw rD,rD,rB # rD = quotient*divisor 

subf rD,rD,rA # rD = remainder 

divw Divide Word 

divw. Divide Word with CR Update. The dot suffix enables the 

update of the condition register. 
divwo Divide Word with Overflow. The o suffix enables the overflow 

bit (OV) in the XER. 
divwo. Divide Word with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables the 

overflow bit (OV) in the XER. 


Divide 

Word 

Unsigned 


divwu 
divwu. 
divwuo 
divwuo. 


rD,rA,rB 


The dividend is the value of (rA). The divisor is the value of (rB). The 

32-bit quotient is placed into rD. The remainder is not supplied as a 

result. 

Both operands are interpreted as unsigned integers. The quotient is 

the unique unsigned integer that satisfies the following: 

dividend = (quotient times divisor) + r 

where 0 ≤ r < divisor. 

If an attempt is made to perform the division 

<anything> / 0 

the contents of register rD are undefined, as are the contents of the 
LT, GT, and EQ bits of the condition register field CR0 if the 
instruction has the condition register updating enabled. In this 
case, if instruction overflow is enabled, then XER[OV] is set. 

The 32-bit unsigned remainder of dividing (rA) by (rB) can be 
computed as follows: 

divwu rD,rA,rB # rD = quotient 

mullw rD,rD,rB # rD = quotient*divisor 

subf rD,rD,rA # rD = remainder 

divwu Divide Word Unsigned 

divwu. Divide Word Unsigned with CR Update. The dot suffix 

enables the update of the condition register. 
divwuo Divide Word Unsigned with Overflow. The o suffix enables 

the overflow bit (OV) in the XER. 
divwuo. Divide Word Unsigned with Overflow and CR Update. The 

o. suffix enables the update of the condition register and 

enables the overflow bit (OV) in the XER. 


Difference 
or Zero 
Immediate 


dozi 


rD,rA,SIMM 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

The sum ¬(rA) + SIMM + 1 is placed into register rD. If the value in 
register rA is algebraically greater than the value of the SIMM field, 
register rD is cleared. 

This instruction is specific to the MPC601. 


Difference 
or Zero 


doz 
doz. 
dozo 
dozo. 


rD,rA,rB 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

The sum ¬(rA) + (rB) + 1 is placed into register rD. If the value in 
register rA is algebraically greater than the value in register rB, 
register rD is cleared. 

If the instruction has condition register updating enabled, condition 
register field CR0 is set to reflect the result placed in register rD (i.e., 
if register rD is set to zero, EQ is set to 1). 

If the instruction has overflow enabled, XER[OV] is only set on 
positive overflows. 

doz Difference or Zero 

doz. Difference or Zero with CR Update. The dot suffix enables 

the update of the condition register. 
dozo Difference or Zero with Overflow. The o suffix enables the 

overflow bit (OV) in the XER. 
dozo. Difference or Zero with Overflow and CR Update. The o. 

suffix enables the update of the condition register and 

enables the overflow bit (OV) in the XER. 
This instruction is specific to the MPC601. 


Absolute 


abs 
abs. 
abso 
abso. 


rD,rA 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

The absolute value |(rA)| is placed into register rD. If register rA 
contains the most negative number (i.e., x'80000000'), the result of 
the instruction is the most negative number, and the XER[OV] bit is 
set if overflow is enabled. 

abs Absolute 

abs. Absolute with CR Update. The dot suffix enables the 

update of the condition register. 
abso Absolute with Overflow. The o suffix enables the overflow 

bit (OV) in the XER. 
abso. Absolute with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 
This instruction is specific to the MPC601. 


Negative 
Absolute 


nabs 
nabs. 
nabso 
nabso. 


rD,rA 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

The negative absolute value -|(rA)| is placed into register rD. 

Note: nabs never overflows. If the instruction is overflow enabled, 
then XER[OV] is cleared to zero and XER[SO] is not changed. 

nabs Negative Absolute 

nabs. Negative Absolute with CR Update. The dot suffix enables 

the update of the condition register. 
nabso Negative Absolute with Overflow. The o suffix enables the 

overflow bit (OV) in the XER. 
nabso. Negative Absolute with Overflow and CR Update. The o. 

suffix enables the update of the condition register and 

enables the overflow bit (OV) in the XER. 
This instruction is specific to the MPC601. 


Multiply 


mul 
mul. 
mulo 
mulo. 


rD,rA,rB 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

Bits 0-31 of the product (rA)*(rB) are placed into register rD. Bits 
32-63 of the product (rA)*(rB) are placed into the MQ register. 

If the condition register updating is enabled, then LT, GT, and EQ 
reflect the result in the MQ register's low-order 32 bits. If the 
instruction is overflow enabled, then the XER[SO] and XER[OV] bits 
are set to one if the product cannot be represented in 32 bits. 

mul Multiply 

mul. Multiply with CR Update. The dot suffix enables the update 

of the condition register. 
mulo Multiply with Overflow. The o suffix enables the overflow 

bit (OV) in the XER. 
mulo. Multiply with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 
This instruction is specific to the MPC601. 


Divide 


div 
div. 
divo 
divo. 


rD,rA,rB 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

The quotient [(rA) || (MQ)]/(rB) is placed into register rD. The 
remainder is placed in the MQ register. The remainder has the same 
sign as the dividend, except that a zero quotient or a zero remainder 
is always positive. The results obey the equation: 

dividend = (divisor * quotient) + remainder 

where dividend is the original (rA) || (MQ), divisor is the original (rB), 
quotient is the final (rD), and remainder is the final (MQ). 

If the condition register updating is enabled, condition register field 
CR0 bits LT, GT, and EQ reflect the remainder. If the instruction is 
overflow enabled, then the XER[SO] and XER[OV] bits are set to one 
if the quotient cannot be represented in 32 bits. 

For the case of -2^31 / -1, the MQ register is cleared to zero and -2^31 is 
placed in register rD. For all other overflows, (MQ), (rD), and 
condition register field CR0 (if condition register updating is enabled) 
are undefined. 

div Divide 

div. Divide with CR Update. The dot suffix enables the update 

of the condition register. 
divo Divide with Overflow. The o suffix enables the overflow bit 

(OV) in the XER. 
divo. Divide with Overflow and CR Update. The o. suffix enables 

the update of the condition register and enables the 

overflow bit (OV) in the XER. 
This instruction is specific to the MPC601. 


Divide Short 


divs 
divs. 
divso 
divso. 


rD,rA,rB 


This is a POWER instruction, and is not part of the PowerPC 
architecture. This instruction will not be supported by other 
PowerPC implementations. 

The quotient (rA)/(rB) is placed into register rD. The remainder is 
placed in MQ. The remainder has the same sign as the dividend, 
except that a zero quotient or a zero remainder is always positive. 
The results obey the equation: 

dividend = (divisor * quotient) + remainder 

where the dividend is the original (rA), divisor is the original (rB), 
quotient is the final (rD), and remainder is the final (MQ). 

If the condition register updating is enabled, then the condition 
register field CR0 bits LT, EQ, and GT reflect the remainder. If the 
instruction is overflow enabled, then the XER[SO] and XER[OV] bits 
are set to one if the quotient cannot be represented in 32 bits (e.g., as 
is the case when the divisor is zero, or the dividend is -2^31 and the 
divisor is -1). For the case of -2^31 / -1, the MQ register is cleared and 
-2^31 is placed in register rD. For all other overflows, (MQ), (rD), and 
condition register field CR0 (if condition register updating is enabled) 
are undefined. 

divs Divide Short 

divs. Divide Short with CR Update. The dot suffix enables the 

update of the condition register. 
divso Divide Short with Overflow. The o suffix enables the 

overflow bit (OV) in the XER. 
divso. Divide Short with Overflow and CR Update. The o. suffix 

enables the update of the condition register and enables 

the overflow bit (OV) in the XER. 
This instruction is specific to the MPC601. 



In addition to supporting all of the PowerPC integer arithmetic instructions, the MPC601 
supports the POWER arithmetic instructions summarized in Table 3-1 and Table 3-2 and 
described in detail in Chapter 10, "Instruction Set." Note that in order to achieve full 
compatibility with future PowerPC implementations, it is up to software to either emulate 
these operations in the program exception handler, or to completely avoid their use. 
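As an illustration of how the Operation column of Table 3-1 can be read, the following Python sketch models three of the listed instructions on 32-bit values and then chains them in the remainder sequence given for divw. The function names and the use of None for the undefined divide cases are assumptions made for this example only; they are not part of the MPC601 definition, and condition register and XER updates are not modeled.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def subf(ra, rb):
    """subf rD,rA,rB: rD = NOT(rA) + (rB) + 1, which equals rB - rA modulo 2**32."""
    return ((~ra & MASK32) + (rb & MASK32) + 1) & MASK32

def mullw(ra, rb):
    """mullw rD,rA,rB: low-order 32 bits of the product (sign-independent)."""
    return ((ra & MASK32) * (rb & MASK32)) & MASK32

def divw(ra, rb):
    """divw rD,rA,rB: signed quotient, truncated toward zero.

    Returns None (an illustrative choice) for the two cases the table
    describes as undefined: division by zero and x'8000_0000' / -1.
    """
    a, b = to_signed(ra), to_signed(rb)
    if b == 0 or (a == -(1 << 31) and b == -1):
        return None
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK32

def remainder(ra, rb):
    """Model of the sequence divw / mullw / subf shown for divw."""
    q = divw(ra, rb)          # rD = quotient
    if q is None:
        return None
    p = mullw(q, rb)          # rD = quotient * divisor
    return subf(p, ra)        # rD = dividend - quotient * divisor

print(to_signed(remainder(-7 & MASK32, 2)))
```

With a dividend of -7 and a divisor of 2, the modeled quotient is -3 and the remainder is -1, which agrees with the rule that the remainder takes the sign of the dividend.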

Table 3-2. MPC601-Specific Integer Arithmetic Instruction Summary 

Mnemonic    Instruction Name 

dozi        Difference or Zero Immediate 
dozx        Difference or Zero 
absx        Absolute 
nabsx       Negative Absolute 
mulx        Multiply 
divx        Divide 
divsx       Divide Short 

3.3.2 Integer Compare Instructions 

The integer compare instructions algebraically or logically compare the contents of register 
rA with either the UIMM operand, the SIMM operand, or the contents of register rB. 
Algebraic comparison compares two signed integers. Logical comparison compares two 
unsigned numbers. 

The L field specifies whether the operands are treated as 32- or 64-bit values. The simplified 
mnemonics for integer compare instructions, shown in Table 3-4, correctly clear the L 
value in the instruction rather than requiring it to be coded as a numeric operand. 

Table 3-3. Integer Compare Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Compare 
Immediate 


cmpi 


crfD,L,rA,SIMM 


The contents of register rA is compared with the sign-extended 
value of the SIMM operand, treating the operands as signed 
integers. The result of the comparison is placed into the CR field 
specified by operand crfD. 


Compare 


cmp 


crfD,L,rA,rB 


The contents of register rA is compared with register rB, treating 
the operands as signed integers. The result of the comparison is 
placed into the CR field specified by operand crfD. 


Compare 

Logical 

Immediate 


cmpli 


crfD,L,rA,UIMM 


The contents of register rA is compared with x'0000' || UIMM, 
treating the operands as unsigned integers. The result of the 
comparison is placed into the CR field specified by operand crfD. 


Compare 
Logical 


cmpl 


crfD,L,rA,rB 


The contents of register rA is compared with register rB, treating 
the operands as unsigned integers. The result of the comparison is 
placed into the CR field specified by operand crfD. 



The crfD field can be omitted if the result of the comparison is to be placed in CR0. 
Otherwise the target CR field must be specified in the instruction crfD field, using one of 
the CR field symbols (CR0-CR7) or an explicit field number. 

The instructions listed in Table 3-4 are simplified mnemonics supported in all PowerPC 
implementations that provide compare word capability for 32-bit operands. 

Table 3-4. Word Compare Simplified Mnemonics 

Operation                        Simplified Mnemonic     Equivalent to: 

Compare Word Immediate           cmpwi crfD,rA,SIMM      cmpi crfD,0,rA,SIMM 
Compare Word                     cmpw crfD,rA,rB         cmp crfD,0,rA,rB 
Compare Logical Word Immediate   cmplwi crfD,rA,UIMM     cmpli crfD,0,rA,UIMM 
Compare Logical Word             cmplw crfD,rA,rB        cmpl crfD,0,rA,rB 



The following examples demonstrate the use of the word compare mnemonics as a way to 
simplify instruction coding: 

1. Compare 32 bits in register rA with immediate value 100 and place result in 
condition register field CR0. 

cmpwi rA,100 (equivalent to cmpi 0,0,rA,100) 

2. Same as (1), but place results in condition register field CR4. 

cmpwi cr4,rA,100 (equivalent to cmpi 4,0,rA,100) 

3. Compare registers rA and rB as logical 32-bit quantities and place result in 
condition register field CR0. 

cmplw rA,rB (equivalent to cmpl 0,0,rA,rB) 
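The difference between algebraic and logical comparison can be seen in a small Python model. This is only a sketch: the function names mirror the simplified mnemonics, the result is returned as an (LT, GT, EQ) tuple rather than written to a CR field, and the fourth bit of a CR field is left out as a simplification.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def flags(a, b):
    """Return the (LT, GT, EQ) bits for the comparison of a with b."""
    return (int(a < b), int(a > b), int(a == b))

def cmpw(ra, rb):
    """Algebraic compare: both operands treated as signed integers."""
    return flags(to_signed(ra), to_signed(rb))

def cmplw(ra, rb):
    """Logical compare: both operands treated as unsigned integers."""
    return flags(ra & MASK32, rb & MASK32)

def cmpwi(ra, simm):
    """Algebraic compare with a sign-extended 16-bit immediate."""
    simm &= 0xFFFF
    if simm & 0x8000:
        simm -= 0x10000
    return flags(to_signed(ra), simm)

# The same bit pattern compares differently under the two interpretations.
print(cmpw(0xFFFFFFFF, 1))    # signed: -1 is less than 1
print(cmplw(0xFFFFFFFF, 1))   # unsigned: 4294967295 is greater than 1
```

The pattern x'FFFFFFFF' is less than 1 under cmpw but greater than 1 under cmplw, which is why the choice of mnemonic matters whenever an operand may have its high-order bit set.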

3.3.3 Integer Logical Instructions 

The logical instructions shown in Table 3-5 perform bit-parallel operations. Logical 
instructions with the condition register update enabled and instructions andi. and andis. set 
condition register field CR0 to characterize the result of the logical operation. These fields 
are set as if the sign-extended low-order 32 bits of the result were algebraically compared 
to zero. Logical instructions without condition register update and the remaining logical 
instructions do not modify the condition register. Logical instructions do not change the 
XER[SO], XER[OV], and XER[CA] bits. 

Table 3-5. Integer Logical Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


AND 
Immediate 


andi. 


rA,rS,UIMM 


The contents of rS is ANDed with x'0000' || UIMM and the result is 
placed into rA. 


AND 

Immediate 

Shifted 


andis. 


rA,rS,UIMM 


The contents of rS is ANDed with UIMM || x'0000' and the result is 
placed into rA. 


OR 
Immediate 


ori 


rA,rS,UIMM 


The contents of rS is ORed with x'0000' || UIMM and the result is 
placed into rA. 

The preferred no-op is ori 0,0,0 


OR 

Immediate 

Shifted 


oris 


rA,rS,UIMM 


The contents of rS is ORed with UIMM || x'0000' and the result is 
placed into rA. 


XOR 
Immediate 


xori 


rA,rS,UIMM 


The contents of rS is XORed with x'0000' || UIMM and the result is 
placed into rA. 


XOR 

Immediate 

Shifted 


xoris 


rA,rS,UIMM 


The contents of rS is XORed with UIMM || x'0000' and the result is 
placed into rA. 


AND 


and 
and. 


rA,rS,rB 


The contents of rS is ANDed with the contents of register rB and the 
result is placed into rA. 

and AND 

and. AND with CR Update. The dot suffix enables the update of 
the condition register. 


OR 


or 
or. 


rA,rS,rB 


The contents of rS is ORed with the contents of rB and the result is 
placed into rA. 

or OR 

or. OR with CR Update. The dot suffix enables the update of the 
condition register. 


XOR 


xor 
xor. 


rA,rS,rB 


The contents of rS is XORed with the contents of rB and the result is 
placed into register rA. 

xor XOR 

xor. XOR with CR Update. The dot suffix enables the update of 
the condition register. 


NAND 


nand 
nand. 


rA,rS,rB 


The contents of rS is ANDed with the contents of rB and the one's 
complement of the result is placed into register rA. 

nand NAND 

nand. NAND with CR Update. The dot suffix enables the update of 

the condition register. 
NAND with rS = rB can be used to obtain the one's complement. 


NOR 


nor 
nor. 


rA,rS,rB 


The contents of rS is ORed with the contents of rB and the one's 
complement of the result is placed into register rA. 

nor NOR 

nor. NOR with CR Update. The dot suffix enables the update of 

the condition register. 
NOR with rS = rB can be used to obtain the one's complement. 


Equivalent 


eqv 
eqv. 


rA,rS,rB 


The contents of rS is XORed with the contents of rB and the 
complemented result is placed into register rA. 

eqv Equivalent 

eqv. Equivalent with CR Update. The dot suffix enables the 
update of the condition register. 


AND with 
Complement 


andc 
andc. 


rA,rS,rB 


The contents of rS is ANDed with the complement of the contents of 
rB and the result is placed into rA. 

andc AND with Complement 

andc. AND with Complement with CR Update. The dot suffix 
enables the update of the condition register. 


OR with 
Complement 


orc 
orc. 


rA,rS,rB 


The contents of rS is ORed with the complement of the contents of rB 
and the result is placed into rA. 

orc OR with Complement 

orc. OR with Complement with CR Update. The dot suffix 
enables the update of the condition register. 


Extend Sign 
Byte 


extsb 
extsb. 


rA,rS 


The contents of rS[24-31] are placed into rA[24-31]. Bit 24 of rS is placed 
into rA[0-23]. 

extsb Extend Sign Byte 

extsb. Extend Sign Byte with CR Update. The dot suffix enables the 
update of the condition register. 


Extend Sign 
Half Word 


extsh 
extsh. 


rA,rS 


The contents of rS[16-31] are placed into rA[16-31]. Bit 16 of rS is placed 
into rA[0-15]. 

extsh Extend Sign Half Word 

extsh. Extend Sign Half Word with CR Update. The dot suffix 
enables the update of the condition register. 


Count 
Leading 
Zeros Word 


cntlzw 
cntlzw. 


rA,rS 


A count of the number of consecutive zero bits starting at bit 0 of rS is 
placed into rA. This number ranges from 0 to 32, inclusive. 

cntlzw Count Leading Zeros Word 

cntlzw. Count Leading Zeros Word with CR Update. The dot suffix 

enables the update of the condition register. 
When the Count Leading Zeros Word instruction has condition 
register updating enabled, the LT field is cleared to zero in CR0. 
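The sign-extension and count-leading-zeros rows of Table 3-5 can be modeled in a few lines of Python. This is a sketch only: the function names are borrowed from the mnemonics, bit 0 is taken to be the most significant bit of the 32-bit word as in the table, and condition register updates are not modeled.

```python
MASK32 = 0xFFFFFFFF

def extsb(rs):
    """Copy rS[24-31] to rA[24-31] and replicate bit 24 into rA[0-23]."""
    byte = rs & 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte

def extsh(rs):
    """Copy rS[16-31] to rA[16-31] and replicate bit 16 into rA[0-15]."""
    half = rs & 0xFFFF
    return (half | 0xFFFF0000) if half & 0x8000 else half

def cntlzw(rs):
    """Count consecutive zero bits starting at bit 0 (the high-order bit)."""
    return 32 - (rs & MASK32).bit_length()

print(hex(extsb(0x00000080)))  # bit 24 is 1, so the upper 24 bits become ones
print(cntlzw(0x00010000))      # 15 zero bits precede the first 1-bit
```

Because the count is always between 0 and 32, the result is never negative, which is consistent with the LT bit being cleared when condition register updating is enabled.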



3.3.4 Integer Rotate and Shift Instructions 

Rotate and shift instructions provide powerful and general ways to manipulate register 
contents. Simplified mnemonics allow some of the simpler operations to be coded easily. 
Mnemonics are provided for the types of operation shown in Table 3-6. 

Table 3-6. Rotate and Shift Operations 



Operation 


Description 


Extract 


Select a field of n bits starting at bit position b in the source register, right or left justify this field in the 
target register, and clear all other bits of the target register to zero. 


Insert 


Select a left- or right-justified field of n bits in the source register, insert this field starting at bit position 
b of the target register, and leave other bits of the target register unchanged. (No simplified mnemonic 
is provided for insertion of a left-justified field when operating on double-words; such an insertion 
requires more than one instruction.) 


Rotate 


Rotate the contents of a register right or left n bits without masking. 


Shift 


Shift the contents of a register right or left n bits, clearing vacated bits to 0 (logical shift). 


Clear 


Clear the leftmost or rightmost n bits of a register to 0. 


Clear left 
and shift 
left 


Clear the leftmost b bits of a register, then shift the register left by n bits. This operation can be used to 
scale a known non-negative array index by the width of an element. 



The IU performs rotation operations on data from a GPR and returns the result, or a portion 
of the result, to a GPR. Rotation operations rotate a 32-bit quantity left by a specified 
number of bit positions. Bits that exit from position 0 enter at position 31. 

Rotate and shift instructions employ a mask generator. The mask is 32 bits long and consists 
of 1-bits from a start bit, MB, through and including a stop bit, ME, and 0-bits elsewhere. 
The values of MB and ME range from 0 to 31. If MB > ME, the 1-bits wrap around from 
position 31 to position 0. Thus the mask is formed as follows: 

if MB ≤ ME then 
    mask[MB-ME] = ones 
    mask[all other bits] = zeros 
else 
    mask[MB-31] = ones 
    mask[0-ME] = ones 
    mask[all other bits] = zeros 

There is no way to specify an all-zero mask. The use of the mask is described in the 
following sections. 
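The mask rule above can be sketched in Python as follows. The function name and the exception raised for out-of-range arguments are assumptions of this example; bit 0 is the most significant bit of the 32-bit mask, following the numbering used in the text.

```python
def generate_mask(mb, me):
    """Build the 32-bit mask with 1-bits from MB through ME inclusive.

    Bit 0 is the high-order bit. When MB > ME the run of 1-bits wraps
    around from bit 31 to bit 0.
    """
    if not (0 <= mb <= 31 and 0 <= me <= 31):
        raise ValueError("MB and ME must be in the range 0 to 31")

    def bit(i):
        # Convert a big-endian bit number into a Python integer bit value.
        return 1 << (31 - i)

    mask = 0
    if mb <= me:
        for i in range(mb, me + 1):
            mask |= bit(i)
    else:
        for i in range(mb, 32):
            mask |= bit(i)
        for i in range(0, me + 1):
            mask |= bit(i)
    return mask

print(hex(generate_mask(24, 31)))  # contiguous run in the low-order byte
print(hex(generate_mask(28, 3)))   # wrapped run: bits 28-31 and bits 0-3
```

Every (MB, ME) pair selects at least one bit, so the model also shows why an all-zero mask cannot be specified.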

If condition register updating is enabled, rotate and shift instructions set condition register 
field CR0 according to the contents of rA at the completion of the instruction. Rotate and 
shift instructions do not change the values of XER[OV] and XER[SO] bits. Rotate and shift 
instructions, except algebraic right shifts, do not change the XER[CA] bit. 

Simplified mnemonics allow simpler coding of often-used functions such as clearing the 
leftmost or rightmost bits of a register, left justifying or right justifying an arbitrary field, 
and simple rotates and shifts. Some of these are shown as examples with the rotate 
instructions. 

POWER Compatibility Note: In addition to supporting all of the PowerPC integer rotate 
and shift instructions, the MPC601 also supports all POWER rotate and shift instructions. 
Note that in order to achieve full compatibility with all POWER applications on future 
PowerPC implementations, it is left up to software to either emulate these operations in the 
instruction exception handler, or to completely avoid their use. These MPC601-specific 
rotate and shift instructions are summarized in Table 3-7. 

Table 3-7. MPC601-Specific Rotate and Shift Instructions 

Mnemonic    Instruction Name 

rlmix       Rotate Left then Mask Insert 
rribx       Rotate Right and Insert Bit 
maskgx      Mask Generate 
maskirx     Mask Insert from Register 
slqx        Shift Left with MQ 
srqx        Shift Right with MQ 
sliqx       Shift Left Immediate with MQ 
slliqx      Shift Left Long Immediate with MQ 
sriqx       Shift Right Immediate with MQ 
srliqx      Shift Right Long Immediate with MQ 
sllqx       Shift Left Long with MQ 
srlqx       Shift Right Long with MQ 
slex        Shift Left Extended 
sleqx       Shift Left Extended with MQ 
srex        Shift Right Extended 
sreqx       Shift Right Extended with MQ 
sraiqx      Shift Right Algebraic Immediate with MQ 
sraqx       Shift Right Algebraic with MQ 
sreax       Shift Right Extended Algebraic 



3.3.4.1 Integer Rotate Instructions 

Integer rotate instructions rotate the contents of a register. The result of the rotation is
either inserted into the target register under control of a mask (if a mask bit is 1, the
associated bit of the rotated data is placed into the target register, and if the mask bit is 0,
the associated bit in the target register is unchanged) or ANDed with a mask before being
placed into the target register.

Rotate left instructions allow right-rotation of the contents of a register to be performed by
a left-rotation of 32-n, where n is the number of bits by which to rotate right.
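
The equivalence between a right-rotation by n and a left-rotation by 32-n can be checked with a short sketch. The Python model below is illustrative only; the helper names rotl32 and rotr32 are assumptions of this example and are not MPC601 mnemonics.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits (n is taken modulo 32)."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def rotr32(value, n):
    """Rotate right by n bits, expressed as a left rotation of 32-n."""
    return rotl32(value, (32 - n) & 31)


if __name__ == "__main__":
    print(hex(rotr32(0x12345678, 8)))
```

Because only left rotation is needed, a single rotate primitive serves both directions.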

3.3.4.2 Integer Shift Instructions 

The instructions in this section perform left and right shifts. Immediate-form logical 
(unsigned) shift operations are obtained by specifying masks and shift values for certain 
rotate instructions. Simplified mnemonics are provided to make coding of such shifts 
simpler and easier to understand. 

Any shift right algebraic instruction, followed by addze, can be used to divide quickly by
2^n.

Multiple-precision shifts can be programmed as shown in Appendix E, "Multiple-Precision 
Shifts." 
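
The shift-right-algebraic/addze idiom mentioned above can be sketched as follows. This Python model is a simplified illustration: srawi returns the shifted result together with the carry it would leave in XER[CA], and addze is reduced to adding that carry to a register. The helpers to_signed and div_pow2 are assumptions of this example.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def srawi(rs, sh):
    """Model of srawi: return (rA, CA) for a shift of sh bits (0-31)."""
    rs &= MASK32
    result = (to_signed(rs) >> sh) & MASK32
    shifted_out = rs & ((1 << sh) - 1)
    # CA is set only when rS is negative and a 1-bit was shifted out.
    ca = 1 if (rs & 0x80000000) and shifted_out else 0
    return result, ca


def addze(ra, ca):
    """Simplified addze: add the carry bit to rA (carry-out is ignored)."""
    return (ra + ca) & MASK32


def div_pow2(x, n):
    """Signed divide of x by 2**n, rounding toward zero."""
    ra, ca = srawi(x & MASK32, n)
    return to_signed(addze(ra, ca))
```

The carry correction is what makes negative operands round toward zero; the arithmetic shift alone would round toward negative infinity.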
The integer rotate instructions are summarized in Table 3-8.

Table 3-8. Integer Rotate Instructions

Name    Mnemonic    Operand Syntax    Operation


Rotate Left Word Immediate then AND with Mask    rlwinm, rlwinm.    rA,rS,SH,MB,ME

The contents of register rS are rotated left by the number of bits
specified by operand SH. A mask is generated having 1-bits from
the bit specified by operand MB through the bit specified by
operand ME and 0-bits elsewhere. The rotated data is ANDed with
the generated mask and the result is placed into register rA.

rlwinm   Rotate Left Word Immediate then AND with Mask
rlwinm.  Rotate Left Word Immediate then AND with Mask with
         CR Update. The dot suffix enables the update of the
         condition register.

Simplified mnemonics:
extlwi rA,rS,n,b    rlwinm rA,rS,b,0,n-1
srwi rA,rS,n        rlwinm rA,rS,32-n,n,31
clrrwi rA,rS,n      rlwinm rA,rS,0,0,31-n

Note: The rlwinm instruction can be used for extracting, clearing,
and shifting bit fields using the methods shown below:

To extract an n-bit field that starts at bit position b in register rS,
right-justified into rA (clearing the remaining 32-n bits of rA), set
SH=b+n, MB=32-n, and ME=31.

To extract an n-bit field that starts at bit position b in rS,
left-justified into rA, set SH=b, MB=0, and ME=n-1.

To rotate the contents of a register left (right) by n bits, set SH=n
(32-n), MB=0, and ME=31.

To shift the contents of a register right by n bits, set SH=32-n,
MB=n, and ME=31.

To clear the high-order b bits of a register and then shift the result
left by n bits, set SH=n, MB=b-n, and ME=31-n.

To clear the low-order n bits of a register, set SH=0, MB=0, and
ME=31-n.
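
The methods listed for rlwinm can be tried out with a small model. The Python sketch below follows the description above (bit 0 is the most significant bit); the function names other than the mnemonics, such as mask and extract_right, are assumptions of this example. The wrap-around handling for MB greater than ME is included for completeness and is not described in the table.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def mask(mb, me):
    """1-bits from bit MB through bit ME, where bit 0 is the MSB."""
    ones_from_mb = MASK32 >> mb
    ones_to_me = (MASK32 << (31 - me)) & MASK32
    if mb <= me:
        return ones_from_mb & ones_to_me
    return ones_from_mb | ones_to_me  # wrapped mask (MB > ME)


def rlwinm(rs, sh, mb, me):
    """Rotate left by SH, then AND with the MB..ME mask."""
    return rotl32(rs, sh) & mask(mb, me)


def extract_right(rs, b, n):
    """n-bit field starting at bit b, right-justified: SH=b+n, MB=32-n, ME=31."""
    return rlwinm(rs, (b + n) & 31, 32 - n, 31)


def srwi(rs, n):
    """Shift right by n bits: SH=32-n, MB=n, ME=31."""
    return rlwinm(rs, (32 - n) & 31, n, 31)


def clrrwi(rs, n):
    """Clear the low-order n bits: SH=0, MB=0, ME=31-n."""
    return rlwinm(rs, 0, 0, 31 - n)
```

For example, extract_right(0x12345678, 8, 8) selects bits 8-15 of the source word and returns them in the low-order byte.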


Rotate Left Word then AND with Mask    rlwnm, rlwnm.    rA,rS,rB,MB,ME

The contents of rS are rotated left by the number of bits specified
by rB[27-31]. A mask is generated having 1-bits from the bit
specified by operand MB through the bit specified by operand ME
and 0-bits elsewhere. The rotated data is ANDed with the
generated mask and the result is placed into rA.

rlwnm   Rotate Left Word then AND with Mask
rlwnm.  Rotate Left Word then AND with Mask with CR Update.
        The dot suffix enables the update of the condition
        register.

Simplified mnemonics:
rotlw rA,rS,rB    rlwnm rA,rS,rB,0,31

Note: The rlwnm instruction can be used to extract and rotate bit
fields using the methods shown below:

To extract an n-bit field that starts at the variable bit position b in
the register specified by operand rS, right-justified into rA (clearing
the remaining 32-n bits of rA), set rB[27-31]=b+n, MB=32-n, and
ME=31.

To extract an n-bit field that starts at variable bit position b in the
register specified by operand rS, left-justified into rA (clearing the
remaining 32-n bits of rA), set rB[27-31]=b, MB=0, and ME=n-1.

To rotate the contents of the low-order 32 bits of a register left
(right) by variable n bits, set rB[27-31]=n (32-n), MB=0, and
ME=31.


Rotate Left Word Immediate then Mask Insert    rlwimi, rlwimi.    rA,rS,SH,MB,ME

The contents of rS are rotated left by the number of bits specified
by operand SH. A mask is generated having 1-bits from the bit
specified by MB through the bit specified by ME and 0-bits
elsewhere. The rotated data is inserted into rA under control of the
generated mask.

rlwimi   Rotate Left Word Immediate then Mask Insert
rlwimi.  Rotate Left Word Immediate then Mask Insert with CR
         Update. The dot suffix enables the update of the
         condition register.

Simplified mnemonic:
inslwi rA,rS,n,b    rlwimi rA,rS,32-b,b,b+n-1

Note: The rlwimi instruction can be used to insert a bit field into the
contents of the register specified by operand rA using the methods
shown below:

To insert an n-bit field that is left-justified in rS into rA starting at bit
position b, set SH=32-b, MB=b, and ME=(b+n)-1.

To insert an n-bit field that is right-justified in rS into rA starting at
bit position b, set SH=32-(b+n), MB=b, and ME=(b+n)-1.

Simplified mnemonics are provided for both of these methods.
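
The two insertion methods can be illustrated with a short Python model of rlwimi. This is a sketch under the bit numbering used above (bit 0 is the most significant bit). The function name inslwi follows the simplified mnemonic shown in the table; insrwi is used here for the right-justified method and, like the helper mask, should be read as an assumption of this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def mask(mb, me):
    """1-bits from bit MB through bit ME (MB <= ME), bit 0 being the MSB."""
    return (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32


def rlwimi(ra, rs, sh, mb, me):
    """Insert the rotated rS into rA wherever the mask holds a 1-bit."""
    m = mask(mb, me)
    return (rotl32(rs, sh) & m) | (ra & ~m & MASK32)


def inslwi(ra, rs, n, b):
    """Insert a left-justified n-bit field of rS at bit b of rA."""
    return rlwimi(ra, rs, (32 - b) & 31, b, b + n - 1)


def insrwi(ra, rs, n, b):
    """Insert a right-justified n-bit field of rS at bit b of rA."""
    return rlwimi(ra, rs, (32 - (b + n)) & 31, b, b + n - 1)
```

Bits of rA outside the MB..ME range pass through unchanged, which is the difference between this instruction and rlwinm.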


Rotate Left then Mask Insert    rlmi, rlmi.    rA,rS,rB,MB,ME

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

The contents of rS are rotated left the number of positions specified
by bits 27-31 of rB. A mask is generated having 1-bits from the bit
specified by MB through the bit specified by ME and 0-bits elsewhere.
The rotated data is inserted into rA under control of the generated
mask.

rlmi   Rotate Left then Mask Insert
rlmi.  Rotate Left then Mask Insert with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Rotate Right and Insert Bit    rrib, rrib.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Bit 0 of rS is rotated right the amount specified by bits 27-31 of rB.
The bit is then inserted into rA.

rrib   Rotate Right and Insert Bit
rrib.  Rotate Right and Insert Bit with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Mask Generate    maskg, maskg.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Let mstart = rS[27-31], specifying the starting point of a mask of
ones. Let mstop = rB[27-31], specifying the end point of the mask
of ones.

If mstart < mstop+1 then
    MASK(mstart...mstop) = ones
    MASK(all other bits) = zeros
If mstart = mstop+1 then
    MASK(0-31) = ones
If mstart > mstop+1 then
    MASK(mstop+1...mstart-1) = zeros
    MASK(all other bits) = ones

MASK is then placed in rA.

maskg   Mask Generate
maskg.  Mask Generate with CR Update. The dot suffix
        enables the update of the condition register.

This instruction is specific to the MPC601.


Mask Insert from Register    maskir, maskir.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is inserted into rA under control of the mask in rB.

maskir   Mask Insert from Register
maskir.  Mask Insert from Register with CR Update. The dot
         suffix enables the update of the condition register.

This instruction is specific to the MPC601.
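
The three cases of Mask Generate, and the masked insert performed by Mask Insert from Register, can be modeled in a few lines. The Python sketch below is illustrative only; it takes register contents as plain integers and uses the mnemonics as function names. The assumption that a 1-bit in rB selects the corresponding bit of rS is this example's reading of "under control of the mask".

```python
MASK32 = 0xFFFFFFFF


def maskg(rs, rb):
    """Model of maskg: build a mask from mstart=rS[27-31], mstop=rB[27-31]."""
    mstart = rs & 31
    mstop = rb & 31
    ones_from_start = MASK32 >> mstart               # bits mstart..31
    ones_to_stop = (MASK32 << (31 - mstop)) & MASK32  # bits 0..mstop
    if mstart < mstop + 1:
        return ones_from_start & ones_to_stop   # ones in mstart..mstop
    if mstart == mstop + 1:
        return MASK32                           # all ones
    return ones_from_start | ones_to_stop       # zeros in mstop+1..mstart-1


def maskir(ra, rs, rb):
    """Model of maskir: take rS where rB has 1-bits, keep rA elsewhere."""
    return ((rs & rb) | (ra & ~rb)) & MASK32
```

A maskg result can be fed directly to maskir as the rB operand to insert a variable-position field.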
The integer shift instructions are summarized in Table 3-9.

Table 3-9. Integer Shift Instructions

Name    Mnemonic    Operand Syntax    Operation


Shift Left Word    slw, slw.    rA,rS,rB

The contents of rS are shifted left the number of bits specified by
rB[26-31]. Bits shifted out of position 0 are lost. Zeros are supplied to
the vacated positions on the right. The 32-bit result is placed into rA.

If rB[26]=1, then rA is filled with zeros.

slw   Shift Left Word
slw.  Shift Left Word with CR Update. The dot suffix enables
      the update of the condition register.


Shift Right Word    srw, srw.    rA,rS,rB

The contents of rS are shifted right the number of bits specified by
rB[26-31]. Zeros are supplied to the vacated positions on the left.
The 32-bit result is placed into rA.

If rB[26]=1, then rA is filled with zeros.

srw   Shift Right Word
srw.  Shift Right Word with CR Update. The dot suffix enables
      the update of the condition register.
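
The role of rB[26] in slw and srw can be seen in a minimal Python model. The sketch assumes register contents are passed as plain integers; rB[26-31] corresponds to the low-order six bits of rB, so rB[26]=1 is equivalent to a shift amount of 32 or more.

```python
MASK32 = 0xFFFFFFFF


def slw(rs, rb):
    """Model of slw: shift left by rB[26-31]; zero result when rB[26]=1."""
    n = rb & 0x3F              # only the low-order six bits of rB are used
    if n > 31:                 # rB[26] = 1
        return 0
    return (rs << n) & MASK32


def srw(rs, rb):
    """Model of srw: shift right by rB[26-31]; zero result when rB[26]=1."""
    n = rb & 0x3F
    if n > 31:
        return 0
    return (rs & MASK32) >> n
```

Bits of rB above the low-order six are ignored, so a shift amount of 64 behaves like a shift amount of zero in this model.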


Shift Right Algebraic Word Immediate    srawi, srawi.    rA,rS,SH

The contents of rS are shifted right the number of bits specified by
operand SH. Bits shifted out of position 31 are lost. The 32-bit result
is sign extended and placed into rA. XER[CA] is set if rS contains a
negative number and any 1-bits are shifted out of position 31;
otherwise XER[CA] is cleared. An operand SH of zero causes rA to
be loaded with the contents of rS and XER[CA] to be cleared to 0.

srawi   Shift Right Algebraic Word Immediate
srawi.  Shift Right Algebraic Word Immediate with CR Update.
        The dot suffix enables the update of the condition register.


Shift Right Algebraic Word    sraw, sraw.    rA,rS,rB

The contents of rS are shifted right the number of bits specified by
rB[26-31]. The 32-bit result is placed into rA. XER[CA] is set to 1 if rS
contains a negative number and any 1-bits are shifted out of position
31; otherwise XER[CA] is cleared to 0. An operand (rB) of zero
causes rA to be loaded with the contents of rS, and XER[CA] to be
cleared to 0. If rB[26]=1, then rA is filled with 32 sign bits (bit 0) from
rS. If rB[26]=0, then rA is filled from the left with sign bits. Condition
register field CR0 is set based on the value written into rA.

sraw   Shift Right Algebraic Word
sraw.  Shift Right Algebraic Word with CR Update. The dot suffix
       enables the update of the condition register.


Shift Left with MQ    slq, slq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left n bits where n is the shift amount specified
in bits 27-31 of register rB. The rotated word is placed in the MQ
register.

When bit 26 of register rB is a zero, a mask of 32-n ones followed by
n zeros is generated.

When bit 26 of register rB is a one, a mask of all zeros is generated.
The logical AND of the rotated word and the generated mask is
placed into register rA.

slq   Shift Left with MQ
slq.  Shift Left with MQ with CR Update. The dot suffix enables
      the update of the condition register.

This instruction is specific to the MPC601.


Shift Right with MQ    srq, srq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified in bits 27-31 of register rB. The rotated word is placed into
the MQ register. When bit 26 of register rB is a zero, a mask of n
zeros followed by 32-n ones is generated.

When bit 26 of register rB is a one, a mask of all zeros is generated.
The logical AND of the rotated word and the generated mask is
placed in rA.

srq   Shift Right with MQ
srq.  Shift Right with MQ with CR Update. The dot suffix
      enables the update of the condition register.

This instruction is specific to the MPC601.
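
The rotate, mask, and AND steps described for slq and srq can be followed in a small Python model. This is a sketch only: each function returns a pair (rA, MQ) instead of updating processor state, and the helper rotl32 is an assumption of this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def slq(rs, rb):
    """Model of slq: return (rA, MQ)."""
    n = rb & 31                       # rB[27-31]
    rotated = rotl32(rs, n)
    # rB[26]=1 selects an all-zero mask; otherwise 32-n ones then n zeros.
    mask = 0 if rb & 0x20 else (MASK32 << n) & MASK32
    return rotated & mask, rotated


def srq(rs, rb):
    """Model of srq: return (rA, MQ)."""
    n = rb & 31
    rotated = rotl32(rs, (32 - n) & 31)
    # rB[26]=1 selects an all-zero mask; otherwise n zeros then 32-n ones.
    mask = 0 if rb & 0x20 else MASK32 >> n
    return rotated & mask, rotated
```

In both cases MQ keeps the full rotated word, including the bits that the mask removes from rA.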


Shift Left Immediate with MQ    sliq, sliq.    rA,rS,SH

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left n bits where n is the shift amount specified
by operand SH. The rotated word is placed in the MQ register. A
mask of 32-n ones followed by n zeros is generated. The logical AND
of the rotated word and the generated mask is placed into register rA.

sliq   Shift Left Immediate with MQ
sliq.  Shift Left Immediate with MQ with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Immediate with MQ    sriq, sriq.    rA,rS,SH

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified by operand SH. The rotated word is placed into the MQ
register. A mask of n zeros followed by 32-n ones is generated. The
logical AND of the rotated word and the generated mask is placed in
register rA.

sriq   Shift Right Immediate with MQ
sriq.  Shift Right Immediate with MQ with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Left Long Immediate with MQ    slliq, slliq.    rA,rS,SH

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left n bits where n is the shift amount specified
by SH. A mask of 32-n ones followed by n zeros is generated. The
rotated word is then merged with the contents of MQ, under control of
the generated mask. The merged word is placed into rA. The rotated
word is placed into the MQ register.

slliq   Shift Left Long Immediate with MQ
slliq.  Shift Left Long Immediate with MQ with CR Update. The
        dot suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Long Immediate with MQ    srliq, srliq.    rA,rS,SH

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified by operand SH. A mask of n zeros followed by 32-n ones is
generated. The rotated word is then merged with the contents of the
MQ register, under control of the generated mask. The merged word
is placed in register rA. The rotated word is placed into the MQ
register.

srliq   Shift Right Long Immediate with MQ
srliq.  Shift Right Long Immediate with MQ with CR Update. The
        dot suffix enables the update of the condition register.

This instruction is specific to the MPC601.
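
One way to see why the "long" forms merge with MQ is to chain sliq and slliq into a two-word left shift. The Python sketch below is an illustration built from the descriptions above, not a sequence taken from this manual; each function returns (rA, MQ), and rotl32 and shift_left_64 are assumptions of this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def sliq(rs, sh):
    """Model of sliq: return (rA, MQ)."""
    rotated = rotl32(rs, sh)
    mask = (MASK32 << sh) & MASK32          # 32-n ones followed by n zeros
    return rotated & mask, rotated


def slliq(rs, sh, mq):
    """Model of slliq: merge the rotated word with MQ; return (rA, MQ)."""
    rotated = rotl32(rs, sh)
    mask = (MASK32 << sh) & MASK32
    merged = (rotated & mask) | (mq & ~mask & MASK32)
    return merged, rotated


def shift_left_64(hi, lo, sh):
    """Shift the 64-bit value hi:lo left by sh bits (0-31)."""
    new_lo, mq = sliq(lo, sh)        # MQ now holds the rotated low word
    new_hi, _ = slliq(hi, sh, mq)    # low-order bits of hi come from MQ
    return new_hi, new_lo
```

The bits rotated out of the low word survive in MQ and are merged into the low-order positions of the high word.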


Shift Left Long with MQ    sllq, sllq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left n bits where n is the shift amount specified
in bits 27-31 of register rB.

When bit 26 of register rB is a zero, a mask of 32-n ones followed by
n zeros is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.

When bit 26 of register rB is a one, a mask of 32-n zeros followed by
n ones is generated. A word of zeros is then merged with the contents
of the MQ register, under control of the generated mask.

The merged word is placed in register rA. The MQ register is not
altered.

sllq   Shift Left Long with MQ
sllq.  Shift Left Long with MQ with CR Update. The dot suffix
       enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Long with MQ    srlq, srlq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified in bits 27-31 of register rB.

When bit 26 of register rB is a zero, a mask of n zeros followed by
32-n ones is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.

When bit 26 of register rB is a one, a mask of n ones followed by
32-n zeros is generated. A word of zeros is then merged with the
contents of the MQ register, under control of the generated mask.

The merged word is placed in register rA. The MQ register is not
altered.

srlq   Shift Right Long with MQ
srlq.  Shift Right Long with MQ with CR Update. The dot suffix
       enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Left Extended    sle, sle.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left n bits where n is the shift amount specified
in bits 27-31 of register rB. The rotated word is placed in the MQ
register. A mask of 32-n ones followed by n zeros is generated.

The logical AND of the rotated word and the generated mask is
placed in register rA.

sle   Shift Left Extended
sle.  Shift Left Extended with CR Update. The dot suffix
      enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Extended    sre, sre.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified in bits 27-31 of register rB. The rotated word is placed into
the MQ register. A mask of n zeros followed by 32-n ones is
generated.

The logical AND of the rotated word and the generated mask is
placed in register rA.

sre   Shift Right Extended
sre.  Shift Right Extended with CR Update. The dot suffix
      enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Left Extended with MQ    sleq, sleq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left n bits where n is the shift amount specified
in bits 27-31 of register rB. A mask of 32-n ones followed by n zeros
is generated. The rotated word is then merged with the contents of
the MQ register, under control of the generated mask. The merged
word is placed in register rA. The rotated word is placed in the MQ
register.

sleq   Shift Left Extended with MQ
sleq.  Shift Left Extended with MQ with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Extended with MQ    sreq, sreq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified in bits 27-31 of register rB. A mask of n zeros followed by
32-n ones is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask.
The merged word is placed in register rA. The rotated word is placed
into the MQ register.

sreq   Shift Right Extended with MQ
sreq.  Shift Right Extended with MQ with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Algebraic Immediate with MQ    sraiq, sraiq.    rA,rS,SH

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified by the operand SH. A mask of n zeros followed by 32-n
ones is generated. The rotated word is placed in the MQ register.

The rotated word is then merged with a word of 32 sign bits from
register rS, under control of the generated mask. The merged word is
placed in register rA. The rotated word is ANDed with the
complement of the generated mask. This 32-bit result is ORed
together and then ANDed with bit 0 of register rS to produce
XER[CA].

Shift Right Algebraic instructions can be used for a fast divide by 2^n if
followed with addze.

sraiq   Shift Right Algebraic Immediate with MQ
sraiq.  Shift Right Algebraic Immediate with MQ with CR Update.
        The dot suffix enables the update of the condition register.

This instruction is specific to the MPC601.


Shift Right Algebraic with MQ    sraq, sraq.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified in bits 27-31 of register rB. When bit 26 of register rB is a
zero, a mask of n zeros followed by 32-n ones is generated. When bit
26 of register rB is a one, a mask of all zeros is generated. The
rotated word is placed in the MQ register. The rotated word is then
merged with a word of 32 sign bits from register rS, under control of
the generated mask.

The merged word is placed in register rA.

The rotated word is ANDed with the complement of the generated
mask. This 32-bit result is ORed together and then ANDed with bit 0
of register rS to produce XER[CA].

Shift Right Algebraic instructions can be used for a fast divide by 2^n if
followed with addze.

sraq   Shift Right Algebraic with MQ
sraq.  Shift Right Algebraic with MQ with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.
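
The way XER[CA] is derived in the algebraic shifts with MQ can be traced with a model of the immediate form, sraiq. The Python sketch below follows the steps in the description (rotate, build the mask, merge with sign bits, then OR the masked-off bits); it returns (rA, MQ, CA) as a tuple, and the helper rotl32 is an assumption of this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def sraiq(rs, sh):
    """Model of sraiq: return (rA, MQ, CA) for a shift of sh bits (0-31)."""
    rs &= MASK32
    rotated = rotl32(rs, (32 - sh) & 31)
    mask = MASK32 >> sh                      # n zeros followed by 32-n ones
    sign = rs >> 31                          # bit 0 of rS
    sign_word = MASK32 if sign else 0
    # Rotated data where the mask is 1, sign bits where it is 0.
    ra = (rotated & mask) | (sign_word & ~mask & MASK32)
    # Bits removed by the mask are ORed together, then ANDed with the sign.
    shifted_out = rotated & ~mask & MASK32
    ca = 1 if sign and shifted_out else 0
    return ra, rotated, ca
```

The carry is 1 only for a negative operand that loses at least one 1-bit, which is the case in which a following addze must correct the quotient.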


Shift Right Extended Algebraic    srea, srea.    rA,rS,rB

This is a POWER instruction, and is not part of the PowerPC
architecture. This instruction will not be supported by other
PowerPC implementations.

Register rS is rotated left 32-n bits where n is the shift amount
specified in bits 27-31 of register rB. A mask of n zeros followed by
32-n ones is generated. The rotated word is placed in the MQ
register.

The rotated word is then merged with a word of 32 sign bits from
register rS, under control of the generated mask.

The merged word is placed in register rA.

The rotated word is ANDed with the complement of the generated
mask. This 32-bit result is ORed together and then ANDed with bit 0
of register rS to produce XER[CA].

srea   Shift Right Extended Algebraic
srea.  Shift Right Extended Algebraic with CR Update. The dot
       suffix enables the update of the condition register.

This instruction is specific to the MPC601.



3.4 Floating-Point Instructions 

This section describes the floating-point instructions, which include the following: 

Floating-point arithmetic instructions 
Floating-point multiply-add instructions 
Floating-point rounding and conversion instructions 
Floating-point compare instructions 
Floating-point status and control register instructions 

Floating-point loads and stores are discussed in Section 3.5, "Load and Store Instructions." 

3.4.1 Floating-Point Arithmetic Instructions 

Single-precision instructions execute faster than their double-precision equivalents in the
MPC601. For additional details on floating-point performance, refer to Chapter 7,
"Instruction Timing."
The floating-point arithmetic instructions are summarized in Table 3-10.

Table 3-10. Floating-Point Arithmetic Instructions

Name    Mnemonic    Operand Syntax    Operation


Floating-Point Add    fadd, fadd.    frD,frA,frB

The floating-point operand in register frA is added to the
floating-point operand in register frB. If the most significant bit of the
resultant significand is not a one, the result is normalized. The result is
rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.

Floating-point addition is based on exponent comparison and
addition of the two significands. The exponents of the two operands
are compared, and the significand accompanying the smaller
exponent is shifted right, with its exponent increased by one for each
bit shifted, until the two exponents are equal. The two significands
are then added algebraically to form an intermediate sum. All 53 bits
in the significand as well as all three guard bits (G, R, and X) enter
into the computation.

If a carry occurs, the sum's significand is shifted right one bit position
and the exponent is increased by one.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fadd   Floating-Point Add
fadd.  Floating-Point Add with CR Update. The dot suffix
       enables the update of the condition register.


Floating-Point Add Single-Precision    fadds, fadds.    frD,frA,frB

The floating-point operand in register frA is added to the
floating-point operand in register frB. If the most significant bit of the
resultant significand is not a one, the result is normalized. The result is
rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.

Floating-point addition is based on exponent comparison and
addition of the two significands. The exponents of the two operands
are compared, and the significand accompanying the smaller
exponent is shifted right, with its exponent increased by one for each
bit shifted, until the two exponents are equal. The two significands
are then added algebraically to form an intermediate sum. All 53 bits
in the significand as well as all three guard bits (G, R, and X) enter
into the computation.

If a carry occurs, the sum's significand is shifted right one bit position
and the exponent is increased by one.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fadds   Floating-Point Add Single-Precision
fadds.  Floating-Point Add Single-Precision with CR Update. The dot
        suffix enables the update of the condition register.
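
The exponent comparison, alignment, and carry steps described for floating-point addition can be sketched in Python. This is a deliberately simplified model and not the MPC601 data path: it handles positive, nonzero operands only, and it drops the bits shifted out during alignment instead of tracking the G, R, and X guard bits, so it matches the hardware result only when no rounding is needed. The names decompose and add_aligned are assumptions of this example.

```python
import math

SIG_BITS = 53  # width of the double-precision significand


def decompose(x):
    """Split a positive float into (sig, exp) with x == sig * 2**exp."""
    m, e = math.frexp(x)                  # x == m * 2**e with 0.5 <= m < 1
    return int(m * (1 << SIG_BITS)), e - SIG_BITS


def add_aligned(a, b):
    """Add two positive floats by aligning exponents, then adding significands."""
    sig_a, exp_a = decompose(a)
    sig_b, exp_b = decompose(b)
    if exp_a < exp_b:                     # keep the larger exponent in a
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    while exp_b < exp_a:                  # shift right, exponent + 1 per bit
        sig_b >>= 1
        exp_b += 1
    total = sig_a + sig_b
    exp = exp_a
    if total >> SIG_BITS:                 # carry out of the significand
        total >>= 1
        exp += 1
    return math.ldexp(total, exp)
```

For instance, add_aligned(3.0, 5.0) aligns the significand of 3.0 by one bit, and the sum then produces a carry that increases the exponent by one.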


Floating-Point Subtract    fsub, fsub.    frD,frA,frB

The floating-point operand in register frB is subtracted from the
floating-point operand in register frA. If the most significant bit of the
resultant significand is not a one, the result is normalized. The result is
rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.

The execution of the Floating-Point Subtract instruction is identical to
that of Floating-Point Add, except that the contents of register frB
participate in the operation with the sign bit (bit 0) inverted.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fsub   Floating-Point Subtract
fsub.  Floating-Point Subtract with CR Update. The dot suffix
       enables the update of the condition register.


Floating-Point Subtract Single-Precision    fsubs, fsubs.    frD,frA,frB

The floating-point operand in register frB is subtracted from the
floating-point operand in register frA. If the most significant bit of the
resultant significand is not a one, the result is normalized. The result is
rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into register frD.

The execution of the Floating-Point Subtract instruction is identical to
that of Floating-Point Add, except that the contents of register frB
participate in the operation with the sign bit (bit 0) inverted.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fsubs   Floating-Point Subtract Single-Precision
fsubs.  Floating-Point Subtract Single-Precision with CR Update.
        The dot suffix enables the update of the condition register.


Floating-Point Multiply    fmul, fmul.    frD,frA,frC

The floating-point operand in register frA is multiplied by the
floating-point operand in register frC.

If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.

Floating-point multiplication is based on exponent addition and
multiplication of the significands.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fmul   Floating-Point Multiply
fmul.  Floating-Point Multiply with CR Update. The dot suffix
       enables the update of the condition register.


Floating-Point Multiply Single-Precision    fmuls, fmuls.    frD,frA,frC

The floating-point operand in register frA is multiplied by the
floating-point operand in register frC.

If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.

Floating-point multiplication is based on exponent addition and
multiplication of the significands.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fmuls   Floating-Point Multiply Single-Precision
fmuls.  Floating-Point Multiply Single-Precision with CR Update.
        The dot suffix enables the update of the condition register.


Floating-Point Divide    fdiv, fdiv.    frD,frA,frB

The floating-point operand in register frA is divided by the
floating-point operand in register frB. No remainder is preserved.

If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR and placed into register frD.

Floating-point division is based on exponent subtraction and division
of the significands.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1 and zero divide
exceptions when FPSCR[ZE]=1.

fdiv   Floating-Point Divide
fdiv.  Floating-Point Divide with CR Update. The dot suffix
       enables the update of the condition register.


Floating- 
Point 
Divide 
Single- 
Precision 


fdivs 
fdivs. 


frD,frA,frB 


The floating-point operand in register frA is divided by the floating- 
point operand in register frB. No remainder is preserved. 

If the most significant bit of the resultant significand is not a 1 , the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR and placed into register frD. 

Floating-point division is based on exponent subtraction and division 
of the significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 and zero divide 
exceptions when FPSCR[ZE]=1. 

fdivs Floating-Point Divide Single-Precision 

fdivs. Floating-Point Divide Single-Precision with CR Update.
The dot suffix enables the update of the condition register.
3.4.2 Floating-Point Multiply-Add Instructions

These instructions combine multiply and add operations without an intermediate rounding 
operation. The fractional part of the intermediate product is 106 bits wide, and all 106 bits 
take part in the add/subtract portion of the instruction. 
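The effect of omitting the intermediate rounding can be sketched in Python. This is an illustrative model only, not MPC601 code: the function names are hypothetical, exact rational arithmetic stands in for the 106-bit intermediate product, and only finite operands under round-to-nearest are modeled.

```python
from fractions import Fraction

def fused_multiply_add(a, c, b):
    """Model of (a * c) + b with a single rounding at the end.

    Assumption: finite inputs only; Fraction keeps the product exact,
    and float() performs one correctly rounded conversion.
    """
    exact = Fraction(a) * Fraction(c) + Fraction(b)
    return float(exact)

def unfused_multiply_add(a, c, b):
    """Separate multiply and add: the product is rounded before the add."""
    return a * c + b

# Operands chosen so that the exact product, 1 - 2**-104, rounds to 1.0.
a = 1.0 + 2.0**-52
c = 1.0 - 2.0**-52
b = -1.0
print(fused_multiply_add(a, c, b))    # the small residual survives
print(unfused_multiply_add(a, c, b))  # the residual is lost to rounding
```

With these operands the separate multiply and add return zero, while the fused form keeps the residual of the exact product.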

The floating-point multiply-add instructions are summarized in Table 3-11. 

Table 3-11. Floating-Point Multiply-Add Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Floating- 
Point 
Multiply- 
Add 


fmadd 
fmadd. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register 
frB is added to this intermediate result. 

If the most significant bit of the resultant significand is not a one the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR and placed into register frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

fmadd Floating-Point Multiply-Add 

fmadd. Floating-Point Multiply-Add with CR Update. The dot 
suffix enables the update of the condition register. 


Floating- 
Point 
Multiply-
Add
Single- 
Precision 


fmadds 
fmadds. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register 
frB is added to this intermediate result.

If the most significant bit of the resultant significand is not a one the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR and placed into register frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

fmadds Floating-Point Multiply-Add Single-Precision 
fmadds. Floating-Point Multiply-Add Single-Precision with CR 

Update. The dot suffix enables the update of the 

condition register. 


Floating- 
Point 
Multiply- 
Subtract 


fmsub 
fmsub. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register 
frB is subtracted from this intermediate result. 

If the most significant bit of the resultant significand is not a one the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR and placed into register frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

fmsub Floating-Point Multiply-Subtract 

fmsub. Floating-Point Multiply-Subtract with CR Update. The dot 
suffix enables the update of the condition register. 


Floating- 
Point 
Multiply- 
Subtract 
Single- 
Precision 


fmsubs 
fmsubs. 


frD,frA,frC,frB


The floating-point operand in register frA is multiplied by the floating-
point operand in register frC. The floating-point operand in register
frB is subtracted from this intermediate result. 

If the most significant bit of the resultant significand is not a one the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR and placed into register frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

fmsubs Floating-Point Multiply-Subtract Single-Precision 
fmsubs. Floating-Point Multiply-Subtract Single-Precision with CR 

Update. The dot suffix enables the update of the 

condition register. 


Floating- 
Point 
Negative 
Multiply- 
Add 


fnmadd 
fnmadd. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register 
frB is added to this intermediate result. 

If the most significant bit of the resultant significand is not a one the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR, then negated and placed into register frD. 

This instruction produces the same result as would be obtained by 
using the floating-point multiply-add instruction and then negating the 
result, with the following exceptions: 

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid 
operation exception have a "sign" bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled 
invalid operation exception retain the "sign" bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fnmadd Floating-Point Negative Multiply-Add 

fnmadd. Floating-Point Negative Multiply-Add with CR Update. 

The dot suffix enables the update of the condition register. 





Floating- 
Point 
Negative 
Multiply- 
Add 
Single- 
Precision 


fnmadds 
fnmadds. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating-
point operand in register frC. The floating-point operand in register
frB is added to this intermediate result.

If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision
under control of the floating-point rounding control field RN of the
FPSCR, then negated and placed into register frD.

This instruction produces the same result as would be obtained by
using the floating-point multiply-add instruction and then negating the
result, with the following exceptions:

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid 
operation exception have a "sign" bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled
invalid operation exception retain the "sign" bit of the SNaN.

FPSCR[FPRF] is set to the class and sign of the result, except for
invalid operation exceptions when FPSCR[VE]=1.

fnmadds Floating-Point Negative Multiply-Add Single-Precision
fnmadds. Floating-Point Negative Multiply-Add Single-Precision with
CR Update. The dot suffix enables the update of the
condition register.


Floating- 
Point 
Negative 
Multiply- 
Subtract 


fnmsub 
fnmsub. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register 
frB is subtracted from this intermediate result.

If the most significant bit of the resultant significand is not a one the 
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR, then negated and placed into register frD. 

This instruction produces the same result as would be obtained by
using the floating-point multiply-subtract instruction and then negating
the result, with the following exceptions:

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid 
operation exception have a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled 
invalid operation exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

fnmsub Floating-Point Negative Multiply-Subtract
fnmsub. Floating-Point Negative Multiply-Subtract with CR Update.
The dot suffix enables the update of the condition register.


Floating- 
Point 
Negative 
Multiply- 
Subtract 
Single- 
Precision 


fnmsubs 
fnmsubs. 


frD,frA,frC,frB 


The floating-point operand in register frA is multiplied by the floating- 
point operand in register frC. The floating-point operand in register 
frB is subtracted from this intermediate result. 

If the most significant bit of the resultant significand is not a one, the
result is normalized. The result is rounded to the target precision 
under control of the floating-point rounding control field RN of the 
FPSCR, then negated and placed into register frD. 

This instruction produces the same result as would be obtained by 
using the floating-point multiply-subtract instruction and then negating 
the result, with the following exceptions: 

• QNaNs propagate with no effect on their "sign" bit. 

• QNaNs that are generated as the result of a disabled invalid 
operation exception have a "sign" bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled 
invalid operation exception retain the "sign" bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

fnmsubs Floating-Point Negative Multiply-Subtract Single-Precision
fnmsubs. Floating-Point Negative Multiply-Subtract Single-
Precision with CR Update. The dot suffix enables the
update of the condition register.



3.4.3 Floating-Point Rounding and Conversion Instructions 

The floating-point rounding instruction is used to produce a 32-bit single-precision number
from a 64-bit double-precision floating-point number. The floating-point convert
instructions convert 64-bit double-precision floating-point numbers to 32-bit signed integer
numbers.
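As a rough illustration of what rounding a double-precision value to single precision does to the value, the following Python sketch round-trips a number through the 32-bit format. The function name is hypothetical, and the sketch assumes the host's default round-to-nearest behavior; the other rounding modes selectable through FPSCR[RN], and out-of-range operands, are not modeled.

```python
import struct

def round_to_single(x):
    """Round a double to the nearest single-precision value.

    Assumption: the operand is within single-precision range and the host
    rounds to nearest; this is not a full model of frsp.
    """
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Single precision keeps 23 fraction bits, so smaller increments are lost.
print(round_to_single(1.5))            # exactly representable, unchanged
print(round_to_single(1.0 + 2.0**-25)) # rounds down to 1.0
```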

On Floating-Point Convert to Integer Word (fctiw) and Floating-Point Convert to Integer
Word with Round toward Zero (fctiwz), the PowerPC architecture defines bits 0-31 of
floating-point register frD as undefined. In the MPC601, these bits take on the value
x'FFF8 0000' (which is the representation for a QNaN). This value may differ in future
PowerPC processors, and software should avoid dependence on this MPC601 feature.

The floating-point rounding instructions are shown in Table 3-12. 









Examples of uses of these instructions to perform various conversions can be found in 
Appendix F, "Floating-Point Models." 



Table 3-12. Floating-Point Rounding and Conversion Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Floating- 
Point 
Round to 
Single- 
Precision 


frsp 
frsp. 


frD,frB


If it is already in single-precision range, the floating-point operand in 
register frB is placed into register frD. Otherwise the floating-point 
operand in register frB is rounded to single-precision using the 
rounding mode specified by FPSCR[RN] and placed into register frD. 

The rounding is described fully in Appendix F, "Floating-Point 
Models." 

FPSCR[FPRF] is set to the class and sign of the result, except for 
invalid operation exceptions when FPSCR[VE]=1 . 

frsp Floating-Point Round to Single-Precision
frsp. Floating-Point Round to Single-Precision with CR
Update. The dot suffix enables the update of the
condition register.


Floating- 
Point 

Convert to 
Integer 
Word 


fctiw 
fctiw. 


frD,frB


The floating-point operand in register frB is converted to a 32-bit 
signed integer, using the rounding mode specified by FPSCR[RN], 
and placed in bits 32-63 of register frD. Bits 0-31 of register frD are 
undefined. 

If the operand in register frB is greater than 2^31 - 1, bits 32-63 of
register frD are set to x'7FFF_FFFF'.

If the operand in register frB is less than -2^31, bits 32-63 of register
frD are set to x'8000_0000'.

The conversion is described fully in Appendix F, "Floating-Point 
Models." 

Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] 
is undefined. FPSCR[FR] is set if the result is incremented when 
rounded. FPSCR[FI] is set if the result is inexact. 

fctiw Floating-Point Convert to Integer Word 
fctiw. Floating-Point Convert to Integer Word with CR Update. 
The dot suffix enables the update of the condition register. 


Floating- 
Point 

Convert to 
Integer 
Word with 
Round
toward Zero


fctiwz 
fctiwz. 


frD,frB


The floating-point operand in register frB is converted to a 32-bit 
signed integer, using the rounding mode Round toward Zero, and 
placed in bits 32-63 of register frD. Bits 0-31 of register frD are 
undefined. 

If the operand in frB is greater than 2^31 - 1, bits 32-63 of frD are set
to x'7FFF_FFFF'.

If the operand in register frB is less than -2^31, bits 32-63 of register
frD are set to x'8000_0000'.

The conversion is described fully in Appendix F, "Floating-Point 
Models." 

Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] 
is undefined. FPSCR[FR] is set if the result is incremented when 
rounded. FPSCR[FI] is set if the result is inexact. 

fctiwz Floating-Point Convert to Integer Word with Round Toward
Zero
fctiwz. Floating-Point Convert to Integer Word with Round Toward
Zero with CR Update. The dot suffix enables the update
of the condition register.
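The saturating behavior described for fctiwz can be sketched as follows. This is a simplified Python model with hypothetical names: it returns only the 32-bit pattern placed in bits 32-63 of frD, it does not model FPSCR updates, and NaN operands are deliberately rejected because their handling is not described in the table above.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def fctiwz_model(x):
    """Return the 32-bit result pattern for a round-toward-zero conversion.

    Assumption: NaN operands are out of scope for this sketch.
    """
    if math.isnan(x):
        raise ValueError("NaN operands are not modeled in this sketch")
    if x > INT32_MAX:
        return 0x7FFFFFFF          # saturate at the largest positive value
    if x < INT32_MIN:
        return 0x80000000          # saturate at the most negative value
    return math.trunc(x) & 0xFFFFFFFF  # truncate, then two's complement

print(hex(fctiwz_model(2.9)))
print(hex(fctiwz_model(-2.9)))
print(hex(fctiwz_model(1e10)))
```

The fctiw instruction differs only in that the rounding mode comes from FPSCR[RN] instead of being fixed at round toward zero.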



3.4.4 Floating-Point Compare Instructions 

Floating-point compare instructions compare the contents of two floating-point registers;
the comparison ignores the sign of zero (that is, +0 = -0). The comparison can be
ordered or unordered. The comparison sets one bit in the designated CR field and clears the
other three bits. The FPCC (floating-point condition code; bits 16-19 in the floating-point
status and control register) is set in the same way.

The CR field and the FPCC are interpreted as shown in Table 3-13. 

Table 3-13. CR Bit Settings 



Bit    Name    Description

0      FL      (frA) < (frB)
1      FG      (frA) > (frB)
2      FE      (frA) = (frB)
3      FU      (frA) ? (frB) (unordered)



On Floating-Point Compare Unordered (fcmpu) and Floating-Point Compare Ordered
(fcmpo) instructions with condition register updating enabled, the PowerPC architecture
defines CR1 and the CR field specified by operand crfD as undefined.

The floating-point compare instructions are summarized in Table 3-14. 
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Table 3-14. Floating-Point Compare Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Floating- 
Point 
Compare 
Unordered 


fcmpu 


crfD,frA,frB 


The floating-point operand in register frA is compared to the floating-
point operand in register frB. The result of the compare is placed into
CR field crfD and the FPCC. 

If an operand is a NaN, either quiet or signalling, CR field crfD and 
the FPCC are set to reflect unordered. If an operand is a Signalling 
NaN, VXSNAN is set. 


Floating- 
Point 
Compare 
Ordered 


fcmpo 


crfD,frA,frB


The floating-point operand in register frA is compared to the floating- 
point operand in register frB. The result of the compare is placed into 
CR field crfD and the FPCC. 

If an operand is a NaN, either quiet or signalling, CR field crfD and 
the FPCC are set to reflect unordered. If an operand is a Signalling
NaN, VXSNAN is set, and if invalid operation is disabled (VE=0) then 
VXVC is set. Otherwise, if an operand is a Quiet NaN, VXVC is set. 
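The way a compare selects exactly one of the four condition bits can be sketched in Python. The function name is hypothetical, and the model covers only the CR field and FPCC result; the VXSNAN and VXVC exception bits that distinguish fcmpu from fcmpo are not modeled.

```python
import math

def fp_compare(a, b):
    """Return the (FL, FG, FE, FU) bits for comparing a with b.

    Assumption: exception-bit side effects are ignored in this sketch.
    """
    if math.isnan(a) or math.isnan(b):
        return (0, 0, 0, 1)   # unordered
    if a < b:
        return (1, 0, 0, 0)   # less than
    if a > b:
        return (0, 1, 0, 0)   # greater than
    return (0, 0, 1, 0)       # equal; +0 and -0 compare equal

print(fp_compare(0.0, -0.0))
print(fp_compare(float("nan"), 1.0))
```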



3.4.5 Floating-Point Status and Control Register Instructions 

Every FPSCR instruction appears to synchronize the effects of all floating-point 
instructions executed by a given processor. Executing an FPSCR instruction ensures that 
all floating-point instructions previously initiated by the given processor appear to have 
completed before the FPSCR instruction is initiated and that no subsequent floating-point 
instructions appear to be initiated by the given processor until the FPSCR instruction has 
completed. In particular: 

• All exceptions caused by the previously initiated instructions are recorded in the 
FPSCR before the FPSCR instruction is initiated. 

• All invocations of the floating-point exception handler that are caused by the previously
initiated instructions have occurred before the FPSCR instruction is initiated.

• No subsequent floating-point instruction that depends on or alters the settings of any 
FPSCR bits appears to be initiated until the FPSCR instruction has completed. 

Floating-point memory access instructions are not affected. 

The floating-point status and control register instructions are summarized in Table 3-15. 
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Table 3-15. Floating-Point Status and Control Register Instructions 



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Move from 
FPSCR 


mffs 
mffs. 


frD 


The contents of the FPSCR are placed into bits 32-63 of register frD.
In the MPC601, bits 0-31 of floating-point register frD are set to the
value x'FFFF_FFFF'.

mffs Move from FPSCR 

mffs. Move from FPSCR with CR Update. The dot suffix 
enables the update of the condition register. 


Move to 
Condition 
Register 
from FPSCR 


mcrfs 


crfD,crfS


The contents of the FPSCR field specified by operand crfS are copied to
the CR field specified by operand crfD. All exception bits copied are
cleared to zero in the FPSCR.


Move to 
FPSCR 
Field 
Immediate 


mtfsfi 
mtfsfi. 


crfD,IMM


The value of the IMM field is placed into FPSCR field crfD. All other
FPSCR fields are unchanged.

mtfsfi Move to FPSCR Field Immediate
mtfsfi. Move to FPSCR Field Immediate with CR Update. The
dot suffix enables the update of the condition register.

When FPSCR[0-3] is specified, bits 0 (FX) and 3 (OX) are set to the
values of IMM[0] and IMM[3] (i.e., even if this instruction causes OX
to change from 0 to 1, FX is set from IMM[0] and not by the usual rule
that FX is set to 1 when an exception bit changes from 0 to 1). Bits 1
and 2 (FEX and VX) are set according to the usual rule described in
2.2.3, "Floating-Point Status and Control Register (FPSCR)," and not
from IMM[1-2].


Move to 
FPSCR 
Fields 


mtfsf 
mtfsf. 


FM,frB


Bits 32-63 of register frB are placed into the FPSCR under control of
the field mask specified by FM. The field mask identifies the 4-bit
fields affected. Let i be an integer in the range 0-7. If FM[i]=1 then
FPSCR field i (FPSCR bits 4*i through 4*i+3) is set to the contents
of the corresponding field of the low-order 32 bits of register frB.

mtfsf Move to FPSCR Fields
mtfsf. Move to FPSCR Fields with CR Update. The dot suffix
enables the update of the condition register.

In other PowerPC implementations, the mtfsf instruction may
perform more slowly when only a portion of the fields are updated.
This is not the case in the MPC601.

When FPSCR[0-3] is specified, bits 0 (FX) and 3 (OX) are set to the
values of frB[32] and frB[35] (i.e., even if this instruction causes OX
to change from 0 to 1, FX is set from frB[32] and not by the usual rule
that FX is set to 1 when an exception bit changes from 0 to 1). Bits 1
and 2 (FEX and VX) are set according to the usual rule described in
2.2.3, "Floating-Point Status and Control Register (FPSCR)," and not
from frB[33-34].


Move to
FPSCR Bit 0


mtfsb0
mtfsb0.


crbD


The bit of the FPSCR specified by operand crbD is cleared to 0.

Bits 1 and 2 (FEX and VX) cannot be explicitly reset.

mtfsb0 Move to FPSCR Bit 0
mtfsb0. Move to FPSCR Bit 0 with CR Update. The dot suffix
enables the update of the condition register.


Move to 
FPSCR Bit 1 


mtfsb1
mtfsb1.


crbD


The bit of the FPSCR specified by operand crbD is set to 1.

Bits 1 and 2 (FEX and VX) cannot be reset explicitly.

mtfsb1 Move to FPSCR Bit 1
mtfsb1. Move to FPSCR Bit 1 with CR Update. The dot suffix
enables the update of the condition register.
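The field-mask selection performed by mtfsf can be sketched in Python. The function name is hypothetical, and the sketch assumes the usual PowerPC bit numbering in which bit 0 is the most significant bit, so FM bit i selects FPSCR bits 4*i through 4*i+3. The special handling of FX, FEX, and VX described in the table is intentionally left out.

```python
def mtfsf_model(fpscr, fm, frb_low):
    """Copy the 4-bit fields of frb_low selected by the 8-bit mask fm.

    Assumptions: fpscr and frb_low are 32-bit values; bit 0 is the most
    significant bit; FX/FEX/VX special cases are not modeled.
    """
    result = fpscr
    for i in range(8):
        if (fm >> (7 - i)) & 1:            # FM bit i, counted from the left
            shift = 28 - 4 * i             # position of field i in the word
            field_mask = 0xF << shift
            result = (result & ~field_mask) | (frb_low & field_mask)
    return result & 0xFFFFFFFF

# Update only field 7 (FPSCR bits 28-31), which holds the RN field.
print(hex(mtfsf_model(0x00000000, 0x01, 0xFFFFFFFF)))
```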



3.5 Load and Store Instructions 

This section describes the load and store instructions of the MPC601, which consist of the 
following: 

• Integer load instructions

• Integer store instructions

• Integer load and store with byte reversal instructions

• Integer load and store multiple instructions

• Floating-point load instructions

• Floating-point store instructions

• Floating-point move instructions

• Memory synchronization instructions

3.5.1 Integer Load and Store Address Generation 

Integer load and store operations generate effective addresses using register indirect with 
immediate index mode, register indirect with index mode or register indirect mode. 

3.5.1.1 Register Indirect with Immediate Index Addressing 

Instructions using this addressing mode contain a signed 16-bit immediate index (d 
operand) which is sign extended to 32 bits, and added to the contents of a general purpose 
register specified in the instruction (rA operand) to generate the effective address. A zero 
in place of the rA operand causes a zero to be added to the immediate index (d operand). 
The option to specify rA or 0 is shown in the instruction descriptions as (rA|0).
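The address calculation for this mode can be sketched in Python. The names are hypothetical; the general-purpose registers are modeled as a list of 32 integers, and a 32-bit effective address is assumed.

```python
def sign_extend_16(d):
    """Sign extend a 16-bit immediate to a Python integer."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def ea_immediate_index(gpr, ra, d):
    """Effective address for register indirect with immediate index.

    Assumption: an rA field of 0 means the value zero, not register r0.
    """
    base = 0 if ra == 0 else gpr[ra]
    return (base + sign_extend_16(d)) & 0xFFFFFFFF

gpr = [0] * 32
gpr[3] = 0x1000
print(hex(ea_immediate_index(gpr, 3, 0xFFFC)))  # displacement of -4
print(hex(ea_immediate_index(gpr, 0, 8)))       # rA field is 0
```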

Figure 3-1 shows how an effective address is generated when using register indirect with 
immediate index addressing. 
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Figure 3-1. Register Indirect with Immediate Index Addressing 

3.5.1.2 Register Indirect with Index Addressing 

Instructions using this addressing mode cause the contents of two general purpose registers
(specified as operands rA and rB) to be added in the generation of the effective address. A
zero in place of the rA operand causes a zero to be added to the contents of the general
purpose register specified in operand rB. The option to specify rA or 0 is shown in the
instruction descriptions as (rA|0).
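A minimal Python sketch of the indexed calculation follows, under the same assumptions as any simple register model: the function name is hypothetical, the registers are a list of 32 integers, and the effective address is 32 bits wide.

```python
def ea_indexed(gpr, ra, rb):
    """Effective address for register indirect with index addressing.

    Assumption: an rA field of 0 contributes the value zero.
    """
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & 0xFFFFFFFF   # wrap to 32 bits

gpr = [0] * 32
gpr[4] = 0x2000
gpr[5] = 0x0010
print(hex(ea_indexed(gpr, 4, 5)))
print(hex(ea_indexed(gpr, 0, 5)))
```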

Figure 3-2 shows how an effective address is generated when using register indirect with 
index addressing. 
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Figure 3-2. Register Indirect with Index Addressing 
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3.5.1.3 Register Indirect Addressing 

Instructions using this addressing mode use the contents of the general purpose register
specified by the rA operand as the effective address. A zero in the rA operand causes an
effective address of zero to be generated. The option to specify rA or 0 is shown in the
instruction descriptions as (rA|0).

Figure 3-3 shows how an effective address is generated when using register indirect 
addressing. 
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Figure 3-3. Register Indirect Addressing 

3.5.2 Integer Load Instructions 

For load instructions, the byte, half-word, word, or double-word addressed by EA is loaded
into rD. Many integer load instructions have an update form, in which rA is updated with
the generated effective address. For these forms, if rA ≠ 0 and rA ≠ rD, the effective
address is placed into rA and the memory element (byte, half-word, or word) addressed by
EA is loaded into rD.

Note that non-MPC601 implementations of the architecture may run the load half algebraic 
instructions (lha, lhax) and the load with update (lbzu, lbzux, lhzu, lhzux, lhau, lhaux)
instructions with greater latency than other types of load instructions. In the MPC601, all 
of these instructions operate with the same latency as other load instructions. For details on 
instruction timing, see Chapter 7, "Instruction Timing." 
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The PowerPC architecture defines load with update instructions with rA=0 or rA=rD as an 
invalid form. In the POWER architecture, these forms are not considered invalid and 
specifications exist for these cases. To maintain compatibility with the POWER 
architecture, for the case where rA=0, the MPC601 does not update r0. In cases where
rA=rD, the load data is loaded into rD and the register rA update is suppressed. In addition,
the PowerPC architecture defines integer load instructions with the condition register
update option enabled to be an invalid form and the POWER architecture does not. For
compatibility, the MPC601 executes the instruction in a manner consistent with the
PowerPC architecture and it causes an undefined value to be placed into the condition
register CR0 field.
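The MPC601 handling of the update forms can be sketched in Python using Load Byte and Zero with Update as the example. The function name, the register list, and the dictionary used for memory are all hypothetical modeling choices; condition register effects are not modeled.

```python
def lbzu_model(gpr, memory, rd, ra, d):
    """Model of load byte and zero with update as described for the MPC601.

    Assumptions: gpr is a list of 32 integers, memory maps byte addresses
    to byte values, and d is the 16-bit displacement field.
    """
    d &= 0xFFFF
    offset = d - 0x10000 if d & 0x8000 else d   # sign extend d
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + offset) & 0xFFFFFFFF
    gpr[rd] = memory[ea] & 0xFF                 # upper bits cleared to 0
    if ra != 0 and ra != rd:                    # update suppressed otherwise
        gpr[ra] = ea
    return ea

gpr = [0] * 32
gpr[3] = 0x1000
memory = {0x1004: 0xAB}
lbzu_model(gpr, memory, 5, 3, 4)
print(hex(gpr[5]), hex(gpr[3]))
```

When rA and rD name the same register, the loaded data wins and the address update is dropped, which matches the POWER-compatible behavior described above.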

Table 3-16 summarizes the load instructions available for the MPC601. 



Table 3-16. Integer Load Instructions



Name 


Mnemonic 


Operand 
Syntax 


Operation 


Load Byte 
and Zero 


lbz


rD,d(rA)


The effective address is the sum (rA|0)+d. The byte in memory
addressed by the EA is loaded into register rD[24-31]. The remaining
bits in register rD are cleared to 0.


Load Byte 
and Zero 
Indexed 


lbzx


rD,rA,rB


The effective address is the sum (rA|0)+(rB). The byte in memory
addressed by the EA is loaded into register rD[24-31]. The remaining
bits in register rD are cleared to 0.


Load Byte 
and Zero 
with Update 


lbzu


rD,d(rA)


The effective address (EA) is the sum (rA|0)+d. The byte in memory
addressed by the EA is loaded into register rD[24-31]. The remaining
bits in register rD are cleared to 0. The EA is placed into register rA. If
operand rA=0 the MPC601 does not update r0, or if rA=rD the load
data is loaded into register rD and the register update is suppressed.
Although the PowerPC architecture defines load with update
instructions with operand rA=0 or rA=rD as invalid forms, the
MPC601 allows these cases.


Load Byte 
and Zero 
with 
Update 
Indexed 


lbzux


rD,rA,rB


The effective address (EA) is the sum (rA|0)+(rB). The byte
addressed by the EA is loaded into register rD[24-31]. The remaining
bits in register rD are cleared to 0. The EA is placed into register rA. If
operand rA=0 the MPC601 does not update register r0, or if rA=rD
the load data is loaded into register rD and the register update is
suppressed. Although the PowerPC architecture defines load with
update instructions with operand rA=0 or rA=rD as invalid forms, the
MPC601 allows these cases.


Load 

Half Word 
and Zero 


lhz


rD,d(rA)


The effective address is the sum (rA|0)+d. The half-word in memory
addressed by the EA is loaded into register rD[16-31]. The remaining
bits in rD are cleared to 0.


Load 

Half Word 
and Zero 
Indexed 


lhzx


rD,rA,rB


The effective address is the sum (rA|0)+(rB). The half-word in
memory addressed by the EA is loaded into register rD[16-31]. The
remaining bits in register rD are cleared.


Load 

Half Word 
and Zero 
with Update 


lhzu


rD,d(rA)


The effective address is the sum (rA|0)+d. The half-word in memory
addressed by the EA is loaded into register rD[16-31]. The remaining
bits in register rD are cleared.

The EA is placed into register rA.

If operand rA=0 the MPC601 does not update register r0, or if rA=rD
the load data is loaded into register rD and the register update is
suppressed. Although the PowerPC architecture defines load with
update instructions with operand rA=0 or rA=rD as invalid forms, the
MPC601 allows these cases.


Load 

Half Word 
and Zero 
with 
Update 
Indexed 


lhzux


rD,rA,rB


The effective address is the sum (rA|0)+(rB). The half-word in
memory addressed by the EA is loaded into register rD[16-31]. The
remaining bits in register rD are cleared. The EA is placed into
register rA. Although the PowerPC architecture defines load with
update instructions with operand rA=0 or rA=rD as invalid forms, the
MPC601 allows these cases.


Load 

Half Word 
Algebraic 


lha


rD,d(rA)


The effective address is the sum (rA|0)+d. The half-word in memory
addressed by the EA is loaded into register rD[16-31]. The remaining
bits in register rD are filled with a copy of bit 0 of the loaded half-word.


Load 

Half Word 
Algebraic 
Indexed 


lhax


rD,rA,rB


The effective address is the sum (rA|0)+(rB). The half-word in
memory addressed by the EA is loaded into register rD[16-31]. The
remaining bits in register rD are filled with a copy of bit 0 of the loaded
half-word.


Load 

Half Word 
Algebraic 
with Update 


lhau


rD,d(rA)


The effective address is the sum (rA|0)+d. The half-word in memory
addressed by the EA is loaded into register rD[16-31]. The remaining
bits in register rD are filled with a copy of bit 0 of the loaded half-word.
The EA is placed into register rA. If operand rA=0 the MPC601 does
not update register r0, or if rA=rD the load data is loaded into register
rD and the register update is suppressed. Although the PowerPC
architecture defines load with update instructions with operand rA=0
or rA=rD as invalid forms, the MPC601 allows these cases.


Load
Half Word
Algebraic
with
Update
Indexed


lhaux


rD,rA,rB


The effective address is the sum (rA|0)+(rB). The half-word in
memory addressed by the EA is loaded into register rD[16-31]. The
remaining bits in register rD are filled with a copy of bit 0 of the loaded
half-word. The EA is placed into register rA. If operand rA=0 the
MPC601 does not update r0, or if rA=rD the load data is loaded into
register rD and the register update is suppressed. Although the
PowerPC architecture defines load with update instructions with
operand rA=0 or rA=rD as invalid forms, the MPC601 allows these
cases.


Load Word 
and Zero 


lwz


rD,d(rA)


The effective address is the sum (rA|0)+d. The word in memory
addressed by the EA is loaded into register rD[0-31].


Load Word 
and Zero 
Indexed 


lwzx


rD,rA,rB


The effective address is the sum (rA|0)+(rB). The word in memory
addressed by the EA is loaded into register rD[0-31].


Load Word 
and Zero 
with Update 


lwzu


rD,d(rA)


The effective address is the sum (rA|0)+d. The word in memory
addressed by the EA is loaded into register rD[0-31]. The EA is
placed into register rA. If operand rA=0 the MPC601 does not update
register r0, or if rA=rD the load data is loaded into register rD and the
register update is suppressed. Although the PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as
invalid forms, the MPC601 allows these cases.


Load Word 
and Zero 
with 
Update 
Indexed 


Iwzux 


rD,rA,rB 


The effective address is the sum (rA|0)+(rB). The word in memory 
addressed by the EAis loaded into register rD[0-31].The EAis 
placed into register rA. If operand rA=0 the MPC601 does not update 
register rO, or if rA=rD the load data is loaded into register rD and the 
register update is suppressed. Although the PowerPC architecture 
defines load with update instructions with operand rA=0 or rA=rD as 
invalid forms, the MPC601 allows these cases. 
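
The update-form behavior described in the load entries above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name `lwzu`, the `gprs` list, and the byte-addressed `memory` object are hypothetical names introduced for illustration, and the model ignores exceptions, alignment, and address translation.

```python
def lwzu(gprs, memory, rD, rA, d):
    """Illustrative model of lwzu rD,d(rA) as the text says the MPC601 executes it.

    gprs   -- hypothetical list of 32 integers standing in for r0-r31
    memory -- hypothetical bytes-like object, indexed by effective address
    d      -- displacement, already sign-extended by the caller
    """
    base = 0 if rA == 0 else gprs[rA]            # (rA|0)
    ea = (base + d) & 0xFFFFFFFF                 # 32-bit effective address
    value = int.from_bytes(memory[ea:ea + 4], "big")
    # Normal case: rA receives the EA. With rA=0 the MPC601 leaves r0
    # untouched; with rA=rD the update is suppressed so the load data wins.
    if rA != 0 and rA != rD:
        gprs[rA] = ea
    gprs[rD] = value
    return ea
```

Calling the model with rA=0 or rA=rD exercises the two forms that the PowerPC architecture treats as invalid but the MPC601 allows.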



3.5.3 Integer Store Instructions

For integer store instructions, the contents of register rS are stored into the byte, half-word, 
word or double-word in memory addressed by EA. Many store instructions have an update 
form, in which register rA is updated with the effective address. For these forms, the 
following rules apply: 

• If rA≠0, the effective address is placed into register rA.

• If rS=rA, the contents of register rS are copied to the target memory element, then
the generated EA is placed into rA.

The PowerPC architecture defines store with update instructions with rA=0 as an invalid
form. In the POWER architecture, this form is not considered invalid and specifications
exist for these cases. To maintain compatibility with POWER in this case, the MPC601
does not update register r0. In addition, PowerPC defines integer store instructions with the
condition register update option enabled to be an invalid form, whereas the POWER
architecture does not. To maintain compatibility in these cases, the MPC601 executes the
instruction as described in the PowerPC architecture, and it loads an undefined value into
field CR0 of the condition register.

A summary of the integer store instructions provided by the MPC601 is shown in 
Table 3-17. 

Table 3-17. Integer Store Instructions

Name: Store Byte
Mnemonic: stb
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d. Register rS[24-31] is
stored into the byte in memory addressed by the EA.

Name: Store Byte Indexed
Mnemonic: stbx
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). rS[24-31] is stored into
the byte in memory addressed by the EA.

Name: Store Byte with Update
Mnemonic: stbu
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d. rS[24-31] is stored into
the byte in memory addressed by the EA. The EA is placed into
register rA.

Name: Store Byte with Update Indexed
Mnemonic: stbux
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). rS[24-31] is stored into
the byte in memory addressed by the EA. The EA is placed into
register rA.

Name: Store Half Word
Mnemonic: sth
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d. rS[16-31] is stored into
the half-word in memory addressed by the EA.

Name: Store Half Word Indexed
Mnemonic: sthx
Operand Syntax: rS,rA,rB
Operation: The effective address (EA) is the sum (rA|0)+(rB). rS[16-31] is stored
into the half-word in memory addressed by the EA.

Name: Store Half Word with Update
Mnemonic: sthu
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d. rS[16-31] is stored into
the half-word in memory addressed by the EA. The EA is placed into
register rA.

Name: Store Half Word with Update Indexed
Mnemonic: sthux
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). rS[16-31] is stored into
the half-word in memory addressed by the EA. The EA is placed into
register rA.

Name: Store Word
Mnemonic: stw
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d. Register rS is stored into
the word in memory addressed by the EA.

Name: Store Word Indexed
Mnemonic: stwx
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). rS is stored into the
word in memory addressed by the EA.

Name: Store Word with Update
Mnemonic: stwu
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d. Register rS is stored
into the word in memory addressed by the EA. The EA is placed into
register rA.

Name: Store Word with Update Indexed
Mnemonic: stwux
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). Register rS is stored
into the word in memory addressed by the EA. The EA is placed into
register rA.



3.5.4 Integer Load and Store with Byte Reversal Instructions 

Table 3-18 describes the integer load and store with byte reversal instructions. Note that in
other PowerPC implementations, load byte-reverse instructions may have greater latency than
other load instructions. This is not the case in the MPC601; these instructions operate with
the same latency as other load instructions.

Table 3-18. Integer Load and Store with Byte Reversal Instructions

Name: Load Half Word Byte-Reverse Indexed
Mnemonic: lhbrx
Operand Syntax: rD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). Bits 0-7 of the
half-word in memory addressed by the EA are loaded into rD[24-31].
Bits 8-15 of the half-word in memory addressed by the EA are
loaded into rD[16-23]. The rest of the bits in rD are cleared to 0.

Name: Load Word Byte-Reverse Indexed
Mnemonic: lwbrx
Operand Syntax: rD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). Bits 0-7 of the word
in memory addressed by the EA are loaded into rD[24-31]. Bits
8-15 of the word in memory addressed by the EA are loaded into
rD[16-23]. Bits 16-23 of the word in memory addressed by the
EA are loaded into rD[8-15]. Bits 24-31 of the word in memory
addressed by the EA are loaded into rD[0-7].

Name: Store Half Word Byte-Reverse Indexed
Mnemonic: sthbrx
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). rS[24-31] are stored
into bits 0-7 of the half-word in memory addressed by the EA.
rS[16-23] are stored into bits 8-15 of the half-word in memory
addressed by the EA.

Name: Store Word Byte-Reverse Indexed
Mnemonic: stwbrx
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). rS[24-31] are stored
into bits 0-7 of the word in memory addressed by the EA.
rS[16-23] are stored into bits 8-15 of the word in memory
addressed by the EA. rS[8-15] are stored into bits 16-23
of the word in memory addressed by the EA. rS[0-7] are stored
into bits 24-31 of the word in memory addressed by the EA.
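
The bit mappings in Table 3-18 amount to reading or writing the memory bytes in reversed order relative to the register. A minimal Python sketch of that mapping follows; the function names mirror the mnemonics but are hypothetical helpers, and `memory` is an assumed bytes-like object indexed directly by effective address.

```python
def lwbrx(memory, ea):
    """Value placed in rD: the byte at EA lands in rD[24-31], the last in rD[0-7]."""
    return int.from_bytes(memory[ea:ea + 4], "little")

def lhbrx(memory, ea):
    """Byte at EA lands in rD[24-31], byte at EA+1 in rD[16-23]; rD[0-15] is 0."""
    return int.from_bytes(memory[ea:ea + 2], "little")

def stwbrx(memory, ea, rs):
    """Store rS so that rS[24-31] goes to the byte at EA (memory must be mutable)."""
    memory[ea:ea + 4] = (rs & 0xFFFFFFFF).to_bytes(4, "little")

def sthbrx(memory, ea, rs):
    """Store rS[16-31] with its two bytes swapped."""
    memory[ea:ea + 2] = (rs & 0xFFFF).to_bytes(2, "little")
```

Interpreting the bytes as little-endian is simply a compact way to express the reversal on a processor whose normal loads treat memory as big-endian.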



3.5.5 Integer Load and Store Multiple Instructions 

The load/store multiple instructions are used to move blocks of data to and from the GPRs. 
The load multiple and store multiple instructions may have operands that require memory 
accesses crossing a 4-Kbyte page boundary. As a result, these instructions may be 
interrupted by a data access exception associated with the address translation of the second 
page. In this case, the MPC601 performs all of the memory references from the first page, 
and none of the memory references from the second page before taking the exception. For 
additional information, refer to Section 5.4.3, "Data Access Exception (x'00300')."

The PowerPC architecture defines the load multiple instruction (lmw) with rA in the range
of registers to be loaded as an invalid form. In the POWER architecture, this form is not
considered invalid and specifications exist for these cases. To maintain compatibility with
the POWER architecture in this case, the MPC601 executes the instruction normally,
except that the loading of register rA is skipped. If rA=0, the register is not considered to
be used for addressing, and r0 (if it is in the range of registers to be loaded) is loaded.
In addition, the PowerPC architecture defines the load multiple and store
multiple instructions with misaligned operands (that is, the EA is not a multiple of 4) to be
an invalid form, whereas the POWER architecture does not. To maintain compatibility with
the POWER architecture, the MPC601 executes these instructions subject to the performance
degradation described in Section 5.4.6.1, "Integer Alignment Exceptions."

Table 3-19. Integer Load and Store Multiple Instructions

Name: Load Multiple Word
Mnemonic: lmw
Operand Syntax: rD,d(rA)
Operation: The effective address is the sum (rA|0)+d.
n = 32 - rD.
n consecutive words starting at the EA are loaded into GPRs rD through
31. If the EA is not a multiple of 4 the alignment exception handler
may be invoked if a page boundary is crossed.

Name: Store Multiple Word
Mnemonic: stmw
Operand Syntax: rS,d(rA)
Operation: The effective address is the sum (rA|0)+d.
n = 32 - rS.
n consecutive words starting at the EA are stored from GPRs rS
through 31.
If the EA is not a multiple of 4 the alignment exception handler may
be invoked if a page boundary is crossed.
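
The register count n = 32 - rD and the MPC601 treatment of rA inside the load range can be sketched as follows. This is an illustrative model, not the processor's implementation: `lmw`, `gprs`, and `memory` are hypothetical names, and exceptions and page crossings are not modeled.

```python
def lmw(gprs, memory, rD, rA, d):
    """Illustrative model of lmw rD,d(rA) on the MPC601.

    gprs   -- hypothetical list of 32 integers standing in for r0-r31
    memory -- hypothetical bytes-like object, indexed by effective address
    Returns n, the number of words covered by the instruction.
    """
    base = 0 if rA == 0 else gprs[rA]          # (rA|0)
    ea = (base + d) & 0xFFFFFFFF
    n = 32 - rD                                # registers rD through r31
    for i in range(n):
        reg = rD + i
        word = int.from_bytes(memory[ea + 4 * i:ea + 4 * i + 4], "big")
        # Per the text, the MPC601 skips loading rA when it falls in the
        # range; r0 is still loaded when rA=0 (it is not used for addressing).
        if rA != 0 and reg == rA:
            continue
        gprs[reg] = word
    return n
```

For example, with rD=29 and rA=30 the model loads r29 and r31 and leaves r30 holding the base address.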



3.5.6 Integer Move String Instructions 

The integer move string instructions allow movement of data from memory to registers or
from registers to memory without concern for alignment. These instructions can be used for
a short move between arbitrary memory locations or to initiate a long move between
misaligned memory fields.

Load/store string indexed instructions of zero length have no effect, except that load string 
indexed instructions of zero length may set register rD to an undefined value. 

Load string and store string instructions may involve operands that are not word-aligned.
As described in Section 5.4.6, "Alignment Exception (x'00600')," misaligned string
operations will suffer a performance penalty as compared to an aligned operation of the
same type. Non-word-aligned string operations that cross a 4-Kbyte boundary as well as
word-aligned string operations that cross a 256-Mbyte boundary always cause an alignment
exception. Other non-word-aligned string operations that cross a double-word boundary
also are slower than word-aligned string operations.

Although string operations that are word-aligned and cross a 4-Kbyte boundary operate at
the MPC601's fastest rate, these instructions may be interrupted by a data access exception
associated with the address translation of the second page. In this case, the MPC601
performs all memory references from the first page and none from the second before taking
the exception. For more information, refer to Section 5.4.3, "Data Access Exception
(x'00300')."

The Load String and Compare Byte Indexed (lscbx) instruction can lead to several
architecturally undefined results. When the last register loaded is only partially filled, the
remaining bytes are considered to be undefined. If loading is terminated due to a byte
match, all succeeding bytes are considered to be undefined. In addition, if the condition
register update option is enabled, and XER[25-31]=0, condition register field CR0 is
undefined. In all of these cases, the MPC601 does not guarantee particular results for these
undefined fields, and software should not depend on their values.

If the EA associated with an lscbx instruction is directed to a memory-forced I/O controller
interface segment (that is, the segment register T-bit is set and the BUID field equals
x'07F'), the address is translated appropriately and the operation proceeds. On the other
hand, if the EA associated with an lscbx instruction is directed to an I/O segment (that is,
the segment register T-bit is set but the BUID does not equal x'07F'), then the MPC601
takes a data access exception and sets bit 5 of the DSISR.

If rA is in the range of registers to be loaded for a Load String Word Immediate (lswi)
instruction or if either rA or rB is in the range of registers to be loaded for a Load String
Word Indexed (lswx) or lscbx instruction, then the PowerPC architecture considers the
instruction to be of an invalid form. In the POWER architecture, this form is not considered
invalid and specifications exist for these cases. To maintain compatibility with the POWER
architecture in this case, the MPC601 executes the instruction normally, but loading of
these registers is inhibited. In addition, the lswx, lscbx and stswx instructions that specify
a string length of zero are considered an invalid form in the PowerPC architecture, but not
in the POWER architecture. For compatibility with the POWER architecture, the MPC601
executes these instructions normally, but does not alter register rD or cause a memory
access.

Table 3-20. Integer Move String Instructions

Name: Load String Word Immediate
Mnemonic: lswi
Operand Syntax: rD,rA,NB
Operation: The EA is (rA|0).
Let n = NB if NB≠0, n = 32 if NB=0; n is the number of bytes to load.
Let nr = CEIL(n/4); nr is the number of registers to receive data.
n consecutive bytes starting at the EA are loaded into GPRs rD
through rD+nr-1. Bytes are loaded left to right in each register. The
sequence of registers wraps around to r0 if required. If the four bytes
of register rD+nr-1 are only partially filled, the unfilled low-order
byte(s) of that register are cleared to 0.
If rA is in the range of registers specified to be loaded, it will be
skipped in the load process. If operand rA=0, the register is not
considered as used for addressing, and will be loaded.

Name: Load String Word Indexed
Mnemonic: lswx
Operand Syntax: rD,rA,rB
Operation: The EA is the sum (rA|0)+(rB).
Let n = XER[25-31]; n is the number of bytes to load.
Let nr = CEIL(n/4); nr is the number of registers to receive data.
If n>0, n consecutive bytes starting at the EA are loaded into registers
rD through rD+nr-1.
Bytes are loaded left to right in each register. The sequence of
registers wraps around to r0 if required. If the four bytes of register
rD+nr-1 are only partially filled, the unfilled low-order byte(s) of that
register are cleared to 0.
If n=0, the contents of register rD are undefined.
If rA is in the range of registers specified to be loaded, it will be
skipped in the load process. If operand rA=0, the register is not
considered as used for addressing, and will be loaded.

Name: Load String and Compare Byte Indexed
Mnemonic: lscbx, lscbx.
Operand Syntax: rD,rA,rB
Operation: The EA is the sum (rA|0)+(rB). XER[25-31] contains the byte count.
Register rD is the starting register. n=XER[25-31], which is the
number of bytes to be loaded. nr=CEIL(n/4), which is the number of
registers to receive data. Starting with the leftmost byte in rD,
consecutive bytes in storage addressed by the EA are loaded into rD
through rD+nr-1, wrapping around back through GPR0 if required,
until either a byte match is found with XER[16-23] or n bytes have
been loaded. If a byte match is found, that byte is also loaded.
Bytes are always loaded left to right in the register. In the case when
a match was found before n bytes were loaded, the contents of the
rightmost byte(s) not loaded of that register and the contents of all
succeeding registers up to and including rD+nr-1 are undefined. Also,
no reference is made to storage after the matched byte is found. In
the case when a match was not found, the contents of the rightmost
byte(s) not loaded of rD+nr-1 are undefined.
When XER[25-31]=0, the content of rD is unchanged. The count of
the number of bytes loaded up to and including the matched byte, if a
match was found, is placed in XER[25-31].
lscbx   Load String and Compare Byte Indexed
lscbx.  Load String and Compare Byte Indexed with CR Update. The dot
        suffix enables the update of the condition register.

Name: Store String Word Immediate
Mnemonic: stswi
Operand Syntax: rS,rA,NB
Operation: The EA is (rA|0).
Let n = NB if NB≠0, n = 32 if NB=0; n is the number of bytes to store.
Let nr = CEIL(n/4); nr is the number of registers to supply data.
n consecutive bytes starting at the EA are stored from register rS
through rS+nr-1.
Bytes are stored left to right from each register. The sequence of
registers wraps around through r0 if required.

Name: Store String Word Indexed
Mnemonic: stswx
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB).
Let n = XER[25-31]; n is the number of bytes to store.
Let nr = CEIL(n/4); nr is the number of registers to supply data.
n consecutive bytes starting at the EA are stored from register rS
through rS+nr-1.
Bytes are stored left to right from each register. The sequence of
registers wraps around through r0 if required.
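
The byte-count arithmetic used by the string instructions (n, nr = CEIL(n/4), register wraparound, and zero fill of a partially filled last register) can be illustrated with a Python model of lswi. This is a sketch under stated assumptions: `lswi`, `gprs`, and `memory` are hypothetical names, memory is assumed to hold at least n bytes at the EA, and exceptions are not modeled.

```python
import math

def lswi(gprs, memory, rD, rA, NB):
    """Illustrative model of lswi rD,rA,NB on the MPC601.

    gprs   -- hypothetical list of 32 integers standing in for r0-r31
    memory -- hypothetical bytes-like object, indexed by effective address
    Returns nr, the number of registers spanned by the string.
    """
    ea = 0 if rA == 0 else gprs[rA]            # the EA is (rA|0)
    n = NB if NB != 0 else 32                  # NB=0 encodes 32 bytes
    nr = math.ceil(n / 4)
    data = bytes(memory[ea:ea + n])
    for i in range(nr):
        reg = (rD + i) % 32                    # wraps around to r0
        chunk = data[4 * i:4 * i + 4]
        chunk = chunk + bytes(4 - len(chunk))  # clear unfilled low-order bytes
        # rA is skipped when it lies in the range, unless rA=0, in which
        # case r0 is not used for addressing and is loaded normally.
        if rA != 0 and reg == rA:
            continue
        gprs[reg] = int.from_bytes(chunk, "big")
    return nr
```

Loading six bytes starting at r31 shows both the wraparound to r0 and the zero fill of the second register.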



3.5.7 Memory Synchronization Instructions 

Memory synchronization instructions can control the order in which memory operations 
are completed with respect to asynchronous events and the order in which memory 
operations are seen by other processors and by other mechanisms that access memory. 
Additional information about these instructions and about related aspects of memory 
management can be found in Chapter 6, "Memory Management Unit." 

The Synchronize (sync) and the Enforce In-Order Execution of I/O (eieio) instructions are
handled in the same manner internally to the MPC601. These instructions delay execution
of subsequent instructions until all previous instructions have completed to the point that
they can no longer cause an exception, all previous memory accesses are performed
globally, and the sync or eieio operation is broadcast onto the MPC601 bus interface.

System designs that use a second-level cache should take special care in accepting the 
broadcast sync operation and performing the appropriate actions to guarantee that memory 
references that may be queued internally to the second-level cache have been performed 
globally. 

The number of cycles the sync and eieio instructions take depends on various system-level
sensitivities and on the processor's state when the instruction is issued. As a result, frequent
use of these instructions may cause some performance degradation.

Note that the PowerPC architecture defines the sync instruction with the condition register
update option enabled to be an invalid form whereas the POWER architecture does not. For
compatibility, the MPC601 executes this case of the instruction consistently with the
PowerPC architecture, and it loads an undefined value into condition register field CR0.

The Instruction Synchronize (isync) instruction causes the MPC601 to purge its instruction 
buffers, wait for any preceding sync instructions to complete and then branch to the next 
sequential instruction (which has the effect of clearing the pipeline behind the isync 
instruction.) 

The Load Word and Reserve Indexed (lwarx) and Store Word Conditional Indexed
(stwcx.) instructions provide an atomic update function for a single, aligned word of
memory. The lwarx instruction must be paired with a stwcx. instruction with the same
effective address used for both instructions of the pair.

The lwarx and stwcx. instructions require the EA to be aligned. Software should not
attempt to emulate a misaligned lwarx or stwcx. instruction because there is no correct way
to define the address associated with the reservation.

The granularity with which reservations are managed is 32 bytes. Therefore the memory to
be accessed by a load and reserve and store conditional instruction should be allocated by
a system library program. Examples of correct uses of these instructions, to emulate
primitives such as "Fetch and Add," "Test and Set," and "Compare and Swap," can be
found in Appendix G, "Synchronization Programming Examples." In general, these
instructions should be used only in system programs, which can be invoked by application
programs as needed.

At most one reservation exists on any given processor; there are not separate
reservations for words and for double words. The address associated with the reservation
can be changed by a subsequent lwarx instruction. The conditionality of the store
conditional instruction's store is based only on whether a reservation exists, not on a match
between the address associated with the reservation and the address computed from the EA
of the stwcx. instruction. A reservation is cleared by any of the following: execution of a
stwcx. instruction to any address by the processor holding the reservation; execution of any
store instruction to the address associated with the reservation by another processor;
execution of an sc instruction; or any exception incurred by the processor that established
the reservation.

The memory synchronization instructions available for the MPC601 are summarized in 
Table 3-21. 

Table 3-21. Memory Synchronization Instructions

Name: Enforce In-Order Execution of I/O
Mnemonic: eieio
Operand Syntax: (none)
Operation: The eieio instruction provides an ordering function for the effects of
load and store instructions executed by a given processor. Executing
an eieio instruction ensures that all memory accesses previously
initiated by the given processor are complete with respect to main
memory before allowing any memory accesses subsequently initiated
by the given processor to access main memory.
The eieio instruction orders load and store operations to cache-
inhibited memory, and store operations to write-through cache
memory.
The eieio instruction performs the same function as a sync
instruction when executed by the MPC601.

Name: Instruction Synchronize
Mnemonic: isync
Operand Syntax: (none)
Operation: This instruction waits for all previous instructions to complete, and
then discards any prefetched instructions, causing subsequent
instructions to be fetched (or refetched) from memory and to execute
in the context established by the previous instructions. This
instruction has no effect on other processors or on their caches.

Name: Load Word and Reserve Indexed
Mnemonic: lwarx
Operand Syntax: rD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB). The word in memory
addressed by the EA is loaded into register rD.
This instruction creates a reservation for use by a stwcx. instruction.
An address computed from the EA is associated with the reservation,
and replaces any address previously associated with the reservation.
The EA must be a multiple of 4. If it is not, the alignment exception
handler will be invoked if the word loaded crosses a page boundary,
or the results may be undefined.

Name: Store Word Conditional Indexed
Mnemonic: stwcx.
Operand Syntax: rS,rA,rB
Operation: The effective address is the sum (rA|0)+(rB).
If a reservation exists, register rS is stored into the word in memory
addressed by the EA and the reservation is cleared.
If a reservation does not exist, the instruction completes without
altering memory.
The EQ bit in the condition register field CR0 is modified to reflect
whether the store operation was performed (i.e., whether a
reservation existed when the stwcx. instruction began execution). If
the store was completed successfully, the EQ bit is set to one.
The EA must be a multiple of 4; otherwise, the alignment exception
handler will be invoked if the word stored crosses a page boundary,
or the results may be undefined.

Name: Synchronize
Mnemonic: sync
Operand Syntax: (none)
Operation: Executing a sync instruction ensures that all instructions previously
initiated by the given processor appear to have completed before any
subsequent instructions are initiated by the given processor. When
the sync instruction completes, all memory accesses initiated by the
given processor prior to the sync will have been performed with
respect to all other mechanisms that access memory. The sync
instruction can be used to ensure that the results of all stores into a
data structure, performed in a "critical section" of a program, are seen
by other processors before the data structure is seen as unlocked.
The Enforce In-Order Execution of I/O (eieio) instruction may be
more appropriate than sync for cases in which the only requirement
is to control the order in which memory references are seen by I/O
devices.
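
The reservation rules described in this section (one reservation per processor, a store conditional that succeeds whenever a reservation exists, and a 32-byte reservation granule) can be illustrated with a toy single-processor model. This is a simplified sketch, not the MPC601 mechanism: the class `Reservation601`, its methods, and the `fetch_and_add` helper are hypothetical names, memory is an assumed dictionary of word addresses, and exceptions and sc are not modeled.

```python
class Reservation601:
    """Toy model of the lwarx/stwcx. reservation on one processor."""

    GRANULE = 32  # reservation granularity in bytes, as stated in the text

    def __init__(self, memory):
        self.memory = memory        # hypothetical dict: word address -> value
        self.reserved = False
        self.address = None

    def lwarx(self, ea):
        # Creates the reservation, replacing any previously reserved address.
        self.reserved = True
        self.address = ea
        return self.memory[ea]

    def stwcx(self, ea, value):
        """Return the CR0[EQ] outcome: True if the store was performed."""
        if not self.reserved:
            return False
        # Success depends only on a reservation existing, not on an address match.
        self.memory[ea] = value & 0xFFFFFFFF
        self.reserved = False
        return True

    def external_store(self, ea, value):
        # A store by another processor into the reserved granule clears it.
        self.memory[ea] = value & 0xFFFFFFFF
        if self.reserved and ea // self.GRANULE == self.address // self.GRANULE:
            self.reserved = False


def fetch_and_add(cpu, ea, delta):
    """Retry loop in the style of a "Fetch and Add" primitive."""
    while True:
        old = cpu.lwarx(ea)
        if cpu.stwcx(ea, old + delta):
            return old
```

The retry loop is the usual shape of such primitives: the load-reserve and store-conditional pair is repeated until the conditional store reports success.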



3.5.8 Floating-Point Load and Store Address Generation 

Floating-point load and store operations generate effective addresses using the register
indirect with immediate index mode and register indirect with index mode, the details of
which are described below. Floating-point loads and stores are not supported for I/O
accesses when the SR[BUID] is not equal to x'07F'. The use of floating-point loads and
stores for I/O access will result in an alignment exception.

3.5.8.1 Register Indirect with Immediate Index Addressing 

Instructions using this addressing mode contain a signed 16-bit immediate index (d
operand) which is sign extended to 32 bits, and added to the contents of a general purpose
register specified in the instruction (rA operand) to generate the effective address. A zero
in the rA operand causes a zero to be added to the immediate index (d operand). This is
shown in the instruction descriptions as (rA|0).
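
The address arithmetic just described can be written out as a short Python sketch. The helper names `sign_extend16` and `ea_immediate_index` and the `gprs` list are hypothetical and introduced only for illustration; the sketch assumes a 32-bit effective address that wraps modulo 2^32.

```python
def sign_extend16(d_field):
    """Sign-extend the 16-bit d field of the instruction to a Python integer."""
    d_field &= 0xFFFF
    return d_field - 0x10000 if d_field & 0x8000 else d_field

def ea_immediate_index(gprs, rA, d_field):
    """Effective address (rA|0)+d for register indirect with immediate index.

    gprs -- hypothetical list of 32 integers standing in for r0-r31
    """
    base = 0 if rA == 0 else gprs[rA]      # rA=0 contributes zero, not r0
    return (base + sign_extend16(d_field)) & 0xFFFFFFFF
```

Note that with rA=0 the contents of r0 are ignored, so a negative displacement yields an address near the top of the 32-bit space.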

Figure 3-4 shows how an effective address is generated when using register indirect with
immediate index addressing.

[Figure not reproduced. Recoverable labels: instruction encoding (opcode, frD/frS, rA), store, load, memory access.]

Figure 3-4. Register Indirect with Immediate Index Addressing

3.5.8.2 Register Indirect with Index Addressing 

Instructions using this addressing mode add the contents of two general purpose registers
(specified in operands rA and rB) to generate the effective address. A zero in the rA
operand causes a zero to be added to the contents of the general purpose register specified in
operand rB. This is shown in the instruction descriptions as (rA|0).

Figure 3-5 shows how an effective address is generated when using register indirect with
index addressing.

[Figure not reproduced.]

Figure 3-5. Register Indirect with Index Addressing

The PowerPC architecture defines floating-point load and store with update instructions
(lfsu, lfsux, lfdu, lfdux, stfsu, stfsux, stfdu, stfdux) with operand rA=0 as invalid forms
of the instructions, but the POWER architecture does not. To maintain compatibility with
the POWER architecture, the MPC601 accesses memory for these cases but inhibits the
update of the integer register r0.

In addition, the PowerPC architecture defines floating-point load and store instructions with
the condition register update option enabled to be an invalid form. For compatibility with
the POWER architecture, the MPC601 executes the instruction normally, but also writes an
undefined value into the condition register field CR1.

The PowerPC architecture defines that the FPSCR[UE] bit should not be used to determine 
whether denormalization should be performed on floating-point stores. The MPC601 
complies with this definition, although this is different from some POWER architecture 
implementations. 

3.5.9 Floating-Point Load Instructions 

There are two basic forms of floating-point load instruction: single-precision and
double-precision. Because the FPRs support only floating-point, double-precision format,
single-precision floating-point load instructions convert single-precision data to
double-precision format before loading the operands into the target FPR. This conversion is
described in Section 3.5.9.1, "Double-Precision Conversion for Floating-Point Load
Instructions." Table 3-22 provides a summary of the floating-point load instructions.

Table 3-22. Floating-Point Load Instructions

Name: Load Floating-Point Single-Precision
Mnemonic: lfs
Operand Syntax: frD,d(rA)
Operation: The effective address is the sum (rA|0)+d.
The word in memory addressed by the EA is interpreted as a floating-
point single-precision operand. This word is converted to floating-
point double-precision format and placed into register frD.

Name: Load Floating-Point Single-Precision Indexed
Mnemonic: lfsx
Operand Syntax: frD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB).
The word in memory addressed by the EA is interpreted as a floating-
point single-precision operand. This word is converted to floating-
point double-precision format and placed into register frD.

Name: Load Floating-Point Single-Precision with Update
Mnemonic: lfsu
Operand Syntax: frD,d(rA)
Operation: The effective address is the sum (rA|0)+d.
The word in memory addressed by the EA is interpreted as a floating-
point single-precision operand. This word is converted to floating-
point double-precision format (see Section 3.5.9.1, "Double-Precision
Conversion for Floating-Point Load Instructions") and placed into
register frD.
The EA is placed into the register specified by rA.

Name: Load Floating-Point Single-Precision with Update Indexed
Mnemonic: lfsux
Operand Syntax: frD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB).
The word in memory addressed by the EA is interpreted as a floating-
point single-precision operand. This word is converted to floating-
point double-precision format (see Section 3.5.9.1, "Double-Precision
Conversion for Floating-Point Load Instructions") and placed into
register frD.
The EA is placed into the register specified by rA.

Name: Load Floating-Point Double-Precision
Mnemonic: lfd
Operand Syntax: frD,d(rA)
Operation: The effective address is the sum (rA|0)+d.
The double-word in memory addressed by the EA is placed into
register frD.

Name: Load Floating-Point Double-Precision Indexed
Mnemonic: lfdx
Operand Syntax: frD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB).
The double-word in memory addressed by the EA is placed into
register frD.

Name: Load Floating-Point Double-Precision with Update
Mnemonic: lfdu
Operand Syntax: frD,d(rA)
Operation: The effective address is the sum (rA|0)+d.
The double-word in memory addressed by the EA is placed into
register frD.
The EA is placed into the register specified by rA.

Name: Load Floating-Point Double-Precision with Update Indexed
Mnemonic: lfdux
Operand Syntax: frD,rA,rB
Operation: The effective address is the sum (rA|0)+(rB).
The double-word in memory addressed by the EA is placed into
register frD.
The EA is placed into the register specified by rA.



3.5.9.1 Double-Precision Conversion for Floating-Point Load 
Instructions 

The steps for converting from single- to double-precision and loading are as follows: 

WORD[0-31] is the floating-point, single-precision operand accessed from memory.

Normalized Operand

If WORD[1-8] > 0 and WORD[1-8] < 255
    frD[0-1]  ← WORD[0-1]
    frD[2]    ← ¬WORD[1]
    frD[3]    ← ¬WORD[1]
    frD[4]    ← ¬WORD[1]
    frD[5-63] ← WORD[2-31] || 29b'0'

Denormalized Operand

If WORD[1-8] = 0 and WORD[9-31] ≠ 0
    sign ← WORD[0]
    exp ← -126
    frac[0-52] ← b'0' || WORD[9-31] || 29b'0'
    normalize the operand
        Do while frac[0] = 0
            frac ← frac[1-52] || b'0'
            exp ← exp - 1
        End
    frD[0]     ← sign
    frD[1-11]  ← exp + 1023
    frD[12-63] ← frac[1-52]

Infinity / QNaN / SNaN / Zero

If WORD[1-8] = 255 or WORD[1-31] = 0
    frD[0-1]  ← WORD[0-1]
    frD[2]    ← WORD[1]
    frD[3]    ← WORD[1]
    frD[4]    ← WORD[1]
    frD[5-63] ← WORD[2-31] || 29b'0'

For double-precision floating-point load instructions, no conversion is required as the data 
from memory is copied directly into the FPRs. 
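
The three conversion cases for single-precision loads can be followed step by step in a Python sketch that operates on raw bit patterns. The function name `single_to_double_bits` is hypothetical; bit 0 in the manual's numbering is the most-significant bit, so WORD[0] corresponds to bit 31 of the Python integer and frD[0] to bit 63.

```python
def single_to_double_bits(word):
    """Convert a 32-bit single-precision pattern to the 64-bit pattern for frD."""
    sign = (word >> 31) & 1                 # WORD[0]
    w1 = (word >> 30) & 1                   # WORD[1]
    exp8 = (word >> 23) & 0xFF              # WORD[1-8]
    frac23 = word & 0x7FFFFF                # WORD[9-31]
    low30 = word & 0x3FFFFFFF               # WORD[2-31]

    if 0 < exp8 < 255:
        # Normalized operand: frD[2-4] receive the complement of WORD[1].
        fill = w1 ^ 1
        return ((word >> 30) << 62) | (fill << 61) | (fill << 60) \
            | (fill << 59) | (low30 << 29)

    if exp8 == 0 and frac23 != 0:
        # Denormalized operand: build the 53-bit frac[0-52], then normalize.
        exp = -126
        frac = frac23 << 29                 # b'0' || WORD[9-31] || 29 zeros
        while (frac >> 52) & 1 == 0:        # frac[0] = 0
            frac = (frac << 1) & ((1 << 53) - 1)
            exp -= 1
        return (sign << 63) | ((exp + 1023) << 52) | (frac & ((1 << 52) - 1))

    # Infinity, QNaN, SNaN, or zero: frD[2-4] receive WORD[1] itself.
    return ((word >> 30) << 62) | (w1 << 61) | (w1 << 60) \
        | (w1 << 59) | (low30 << 29)
```

One way to sanity-check such a sketch is to compare its output against the host's own single-to-double conversion, for example with the standard `struct` module.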

Many floating-point load instructions have an update form in which register rA is updated 
with the EA. For these forms, if operand rA ≠ 0, the effective address is placed into register 
rA and the memory element (word or double-word) addressed by the EA is loaded into the 
floating-point register specified by operand frD. 
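
The update form can be pictured with the following Python sketch. It is a simplified model, not the processor's implementation: the names load_double_with_update, gpr, fpr, and memory are hypothetical, memory is a plain dictionary keyed by address, and the 16-bit sign-extended displacement is taken from the PowerPC instruction format rather than from this section.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(d: int) -> int:
    """Interpret a 16-bit displacement field as a signed value."""
    return (d & 0x7FFF) - (d & 0x8000)

def load_double_with_update(gpr, fpr, memory, frD, rA, d):
    """Model of an update-form load: EA = (rA|0) + d, then rA receives the EA."""
    base = gpr[rA] if rA != 0 else 0          # (rA|0): zero is used when rA = 0
    ea = (base + sign_extend16(d)) & MASK32
    fpr[frD] = memory[ea]                     # double-word at EA goes to frD
    if rA != 0:
        gpr[rA] = ea                          # update only when rA is not 0
    return ea
```

With gpr[3] = 0x1000 and d = 8, the model loads from address 0x1008 and leaves 0x1008 in gpr[3], so a following access with the same displacement steps to the next element.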

3.5.10 Floating-Point Store Instructions 

This section describes floating-point store instructions. There are two basic forms of the 
store instruction — single- and double-precision. Because the FPRs support only floating- 
point, double-precision format, single-precision floating-point store instructions convert 
double-precision data to single-precision format before storing the operands. The 
conversion steps are described in Section 3.5.10.1, "Double-Precision Conversion for 
Floating-Point Store Instructions." Table 3-23 is a summary of the floating-point store 
instructions provided by the MPC601. 

Table 3-23. Floating-Point Store Instructions 

Name:            Store Floating-Point Single-Precision 
Mnemonic:        stfs 
Operand Syntax:  frS,d(rA) 
Operation:       The EA is the sum (rA|0)+d. 
                 The contents of register frS is converted to single-precision and 
                 stored into the word in memory addressed by the EA. 

Name:            Store Floating-Point Single-Precision Indexed 
Mnemonic:        stfsx 
Operand Syntax:  frS,rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 
                 The contents of register frS is converted to single-precision and 
                 stored into the word in memory addressed by the EA. 

Name:            Store Floating-Point Single-Precision with Update 
Mnemonic:        stfsu 
Operand Syntax:  frS,d(rA) 
Operation:       The EA is the sum (rA|0)+d. 
                 The contents of register frS is converted to single-precision and 
                 stored into the word in memory addressed by the EA. 
                 The EA is placed into the register specified by operand rA. 

Name:            Store Floating-Point Single-Precision with Update Indexed 
Mnemonic:        stfsux 
Operand Syntax:  frS,rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 
                 The contents of register frS is converted to single-precision and 
                 stored into the word in memory addressed by the EA. 
                 The EA is placed into the register specified by operand rA. 

Name:            Store Floating-Point Double-Precision 
Mnemonic:        stfd 
Operand Syntax:  frS,d(rA) 
Operation:       The effective address is the sum (rA|0)+d. 
                 The contents of register frS is stored into the double-word in memory 
                 addressed by the EA. 

Name:            Store Floating-Point Double-Precision Indexed 
Mnemonic:        stfdx 
Operand Syntax:  frS,rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 
                 The contents of register frS is stored into the double-word in memory 
                 addressed by the EA. 

Name:            Store Floating-Point Double-Precision with Update 
Mnemonic:        stfdu 
Operand Syntax:  frS,d(rA) 
Operation:       The effective address is the sum (rA|0)+d. 
                 The contents of register frS is stored into the double-word in memory 
                 addressed by the EA. 
                 The EA is placed into register rA. 

Name:            Store Floating-Point Double-Precision with Update Indexed 
Mnemonic:        stfdux 
Operand Syntax:  frS,rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 
                 The contents of register frS is stored into the double-word in memory 
                 addressed by the EA. 
                 The EA is placed into register rA. 


3.5.10.1 Double-Precision Conversion for Floating-Point Store 
Instructions 

The steps for converting double- to single-precision for floating-point store instructions are 
as follows: 

Let WORD[0-31] be the word in memory written to. 

No Denormalization Required 

If frS[1-11] > 896 or frS[1-63] = 0 
    WORD[0-1] ← frS[0-1] 
    WORD[2-31] ← frS[5-34] 

Denormalization Required 

If 874 ≤ frS[1-11] ≤ 896 
    sign ← frS[0] 
    exp ← frS[1-11] - 1023 
    frac ← b'1' || frS[12-63] 
    Denormalize operand 
        Do while exp < -126 
            frac ← b'0' || frac[0-62] 
            exp ← exp + 1 
        End 
    WORD[0] ← sign 
    WORD[1-8] ← x'00' 
    WORD[9-31] ← frac[1-23] 

For double-precision floating-point store instructions, no conversion is required as the data 
from the FPRs is copied directly into memory. Many floating-point store instructions have 
an update form, in which register rA is updated with the effective address. For these forms, 
if operand rA ≠ 0, the effective address is placed into register rA. 
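
The store-side conversion for single-precision stores can be modeled in the same bit-level style. The sketch below is an illustration under stated assumptions: the function name double_to_single_bits is hypothetical, the manual's bit 0 is treated as the most significant bit of a Python integer, and inputs that fall outside both documented cases raise an error because this section does not define a result for them.

```python
def double_to_single_bits(frs: int) -> int:
    """Convert a 64-bit FPR pattern to the 32-bit word written by a single store."""
    exp_field = (frs >> 52) & 0x7FF                  # frS[1-11]

    if exp_field > 896 or (frs & ((1 << 63) - 1)) == 0:
        # No denormalization required.
        top = (frs >> 62) & 0b11                     # frS[0-1]
        low = (frs >> 29) & 0x3FFFFFFF               # frS[5-34]
        return (top << 30) | low

    if 874 <= exp_field <= 896:
        # Denormalization required: shift right until exp reaches -126.
        sign = (frs >> 63) & 1
        exp = exp_field - 1023
        frac = (1 << 52) | (frs & ((1 << 52) - 1))   # b'1' || frS[12-63]
        while exp < -126:
            frac >>= 1
            exp += 1
        # WORD[1-8] = x'00', WORD[9-31] = frac[1-23]
        return (sign << 31) | ((frac >> 29) & 0x7FFFFF)

    # Assumption: neither documented case applies, so no result is modeled.
    raise ValueError("operand outside the cases described for single stores")
```

The fraction is truncated rather than rounded, so the model is exact only for register contents that are already representable in single-precision format.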

Floating-point store instructions are listed in Table 3-23. Recall that rA, rB, and rD denote 
GPRs, while frA, frB, frC, frS and frD denote FPRs. 

3.5.11 Floating-Point Move Instructions 

Floating-point move instructions copy data from one floating-point register to another with 
data modifications as described for each instruction. These instructions do not modify the 
FPSCR. The condition register update option in these instructions controls the placing of 
result status into condition register field CR1. If the condition register update option is 
enabled, then CR1 is set; otherwise CR1 is unchanged. Floating-point move instructions are 
listed in Table 3-24. 

Table 3-24. Floating-Point Move Instructions 



Name:            Floating-Point Move Register 
Mnemonic:        fmr, fmr. 
Operand Syntax:  frD,frB 
Operation:       The contents of register frB is placed into frD. 
                 fmr    Floating-Point Move Register 
                 fmr.   Floating-Point Move Register with CR Update. The dot 
                        suffix enables the update of the condition register. 

Name:            Floating-Point Negate 
Mnemonic:        fneg, fneg. 
Operand Syntax:  frD,frB 
Operation:       The contents of register frB with bit 0 inverted is placed into 
                 register frD. 
                 fneg   Floating-Point Negate 
                 fneg.  Floating-Point Negate with CR Update. The dot suffix 
                        enables the update of the condition register. 

Name:            Floating-Point Absolute Value 
Mnemonic:        fabs, fabs. 
Operand Syntax:  frD,frB 
Operation:       The contents of frB with bit 0 cleared to zero is placed into frD. 
                 fabs   Floating-Point Absolute Value 
                 fabs.  Floating-Point Absolute Value with CR Update. The dot 
                        suffix enables the update of the condition register. 

Name:            Floating-Point Negative Absolute Value 
Mnemonic:        fnabs, fnabs. 
Operand Syntax:  frD,frB 
Operation:       The contents of frB with bit 0 set to one is placed into frD. 
                 fnabs  Floating-Point Negative Absolute Value 
                 fnabs. Floating-Point Negative Absolute Value with CR Update. 
                        The dot suffix enables the update of the condition register. 
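
Because the floating-point move instructions only copy a register and optionally change the sign bit, they can be modeled as operations on bit 0 of a 64-bit pattern. The following Python sketch assumes that view; the function names are hypothetical, and the optional CR1 update is not modeled.

```python
SIGN_BIT = 1 << 63            # bit 0 in the manual's numbering
MASK64 = (1 << 64) - 1

def fp_move(frb: int) -> int:
    """fmr: copy the source pattern unchanged."""
    return frb & MASK64

def fp_negate(frb: int) -> int:
    """fneg: invert bit 0."""
    return (frb ^ SIGN_BIT) & MASK64

def fp_abs(frb: int) -> int:
    """fabs: clear bit 0."""
    return frb & ~SIGN_BIT & MASK64

def fp_nabs(frb: int) -> int:
    """fnabs: set bit 0."""
    return (frb | SIGN_BIT) & MASK64
```

Since only the sign bit is touched, NaN payloads and denormalized values pass through unchanged in this model.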



3.6 Flow Control Instructions 

Branch instructions are executed by the BPU. Some of these instructions can redirect 
instruction execution conditionally based on the value of bits in the condition register. 
When the branch processor encounters one of these instructions, it scans the execution 
pipelines to determine whether an instruction in progress may affect the particular 
condition register bit. If no interlock is found, the branch can be resolved immediately by 
checking the bit in the condition register and taking the action defined for the branch 
instruction. 

If an interlock is detected, the branch is considered unresolved and the direction of the 
branch is predicted using the y-bit as described in Table 3-25. The interlock is monitored 
while instructions are fetched for the predicted branch. When the interlock is cleared, the 
branch processor determines whether the prediction was correct based on the value of the 
condition register bit. If the prediction is correct, the branch is considered completed and 
instruction fetching continues. If the prediction is incorrect, the prefetched instructions are 
purged, and instruction fetching continues along the alternate path. 

3.6.1 Branch Instruction Address Calculation 

Branch instructions can change the sequence of instruction execution. Instruction addresses 
are always assumed to be on word boundaries with the MPC601; therefore the processor 
ignores the two low-order bits of the generated branch target address. 

Branch instructions compute the effective address (EA) of the next instruction address 
using the following addressing modes: 

• Branch relative 

• Branch to absolute address 

• Branch conditional to relative address 

• Branch conditional to absolute address 

• Branch conditional to link register 

• Branch conditional to count register 

3.6.1.1 Branch Relative Address Mode 

Instructions that use branch relative addressing generate the next instruction address by 
sign extending the immediate displacement operand LI and adding the resultant value to 
the current instruction address. Branches using this address mode have the absolute 
addressing option (AA) disabled. If the link register update option (LK) is enabled, the 
effective address of the instruction following the branch instruction is placed in the link 
register. 

Figure 3-6 shows how the branch target address is generated when using the branch relative 
addressing mode. 

Figure 3-6. Branch Relative Addressing 

3.6.1.2 Branch Conditional Relative Address Mode 

If the branch conditions are met, instructions that use the branch conditional relative 
address mode generate the next instruction address by sign extending the immediate 
displacement operand (BD) and adding the resultant value to the current instruction 
address. Branches using this address mode have the absolute addressing option (AA) 
disabled. If the link register update option (LK) is enabled, the effective address of the 
instruction following the branch instruction is placed in the link register. 

Figure 3-7 shows how the branch target address is generated when using the branch 
conditional relative addressing mode. 

Figure 3-7. Branch Conditional Relative Addressing 

3.6.1.3 Branch to Absolute Address Mode 

Instructions that use branch to absolute address mode generate the next instruction address 
by sign extending the LI operand. Branches using this address mode have the absolute 
addressing option (AA) enabled. If the link register update option (LK) is enabled, the 
effective address of the instruction following the branch instruction is placed in the link 
register. 

Figure 3-8 shows how the branch target address is generated when using the branch to 
absolute address mode. 

Figure 3-8. Branch to Absolute Addressing 

3.6.1.4 Branch Conditional to Absolute Address Mode 

If the branch conditions are met, instructions that use the branch conditional to absolute 
address mode generate the next instruction address by sign extending the BD operand. 
Branches using this address mode have the absolute addressing option (AA) enabled. If the 
link register update option (LK) is enabled, the effective address of the instruction 
following the branch instruction is placed in the link register. 

Figure 3-9 shows how the branch target address is generated when using the branch 
conditional to absolute address mode. 

Figure 3-9. Branch Conditional to Absolute Addressing 

3.6.1.5 Branch Conditional to Link Register Address Mode 

If the branch conditions are met, the branch conditional to link register instruction 
generates the next instruction address by fetching the contents of the link register and 
clearing the two low order bits to zero. If the link register update option (LK) is enabled, 
the effective address of the instruction following the branch instruction is placed in the link 
register. 

Figure 3-10 shows how the branch target address is generated when using the branch 
conditional to link register address mode. 

Figure 3-10. Branch Conditional to Link Register Addressing 

3.6.1.6 Branch Conditional to Count Register 

If the branch conditions are met, the branch conditional to count register instruction 
generates the next instruction address by fetching the contents of the count register and 
clearing the two low order bits to zero. If the link register update option (LK) is enabled, 
the effective address of the instruction following the branch instruction is placed in the link 
register. 

Figure 3-11 shows how the branch target address is generated when using the branch 
conditional to count register address mode. 

[Figure: instruction encoding with fields 19, BO, BI, 00000, 528, and LK; the branch target 
address is formed from CTR bits 0-29 followed by b'00'.] 

Figure 3-11. Branch Conditional to Count Register Addressing 

When the branch instructions contain immediate addressing operands, the target addresses 
can be computed sufficiently ahead of the branch instruction that instructions can be 
prefetched along the target path. If the branch instructions use the link and count registers, 
instructions along the target path can be prefetched if the link or count register is loaded 
sufficiently ahead of the branch instruction. 
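
The address modes of Section 3.6.1 reduce to a few lines of arithmetic. The Python sketch below is a simplified model under stated assumptions: the function names are hypothetical, the field widths (24 bits for LI, 14 bits for BD) and the appended two zero bits come from the PowerPC instruction formats rather than from the text of this section, and addresses are treated as 32-bit values.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a signed number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

def target_immediate(cia: int, field: int, field_bits: int, aa: bool) -> int:
    """Target for b/bc forms: field is LI (24 bits) or BD (14 bits).

    The field is extended with two zero bits, sign-extended, and then either
    used directly (AA enabled) or added to the current instruction address.
    """
    disp = sign_extend(field << 2, field_bits + 2)
    return (disp if aa else cia + disp) & MASK32

def target_register(reg: int) -> int:
    """Target for bclr/bcctr forms: LR or CTR with the two low-order bits cleared."""
    return reg & MASK32 & ~0b11

def link_value(cia: int) -> int:
    """Value placed in the link register when LK is enabled."""
    return (cia + 4) & MASK32
```

For instance, a relative branch at address 0x1000 with LI = 0xFFFFFF (a displacement of -4 after extension) targets 0x0FFC, while the same field with AA enabled yields the absolute address 0xFFFFFFFC.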

Branching can be conditional or unconditional, and the return address can optionally be 
provided. If the return address is to be provided, the effective address of the instruction 
following the branch instruction is placed in the link register after the branch target address 
has been computed. This is done regardless of whether the branch is taken. 

For branch conditional instructions, the BO operand specifies the conditions under which 
the branch is taken. The first four bits of the BO operand specify how the branch is affected 
by or affects the condition and count registers. The fifth bit, shown in Table 3-25 as having 
the value y, may be used by some implementations for branch prediction as described 
below. 

The encodings for the BO operands are shown in Table 3-25. 

Table 3-25. BO Operand Encodings 

BO      Description 
0000y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is FALSE. 
0001y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is FALSE. 
001zy   Branch if the condition is FALSE. 
0100y   Decrement the CTR, then branch if the decremented CTR ≠ 0 and the condition is TRUE. 
0101y   Decrement the CTR, then branch if the decremented CTR = 0 and the condition is TRUE. 
011zy   Branch if the condition is TRUE. 
1z00y   Decrement the CTR, then branch if the decremented CTR ≠ 0. 
1z01y   Decrement the CTR, then branch if the decremented CTR = 0. 
1z1zz   Branch always. 

The z indicates a bit that must be zero; otherwise, the instruction form is invalid. 

The y bit provides a hint about whether a conditional branch is likely to be taken and is used by the 
MPC601 to improve performance. Other implementations may ignore the y bit. 
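
The encodings in Table 3-25 can be read as two independent tests, one on the count register and one on the condition bit. The following Python sketch illustrates that reading; the function name branch_taken and its return convention are assumptions made for this example, and the y and z bits are ignored.

```python
def branch_taken(bo: int, ctr: int, cond_bit: int):
    """Evaluate a 5-bit BO operand; returns (taken, new_ctr).

    BO[0] is the most significant of the five bits. cond_bit is the value
    (0 or 1) of the CR bit selected by the BI operand.
    """
    bo0 = (bo >> 4) & 1   # 1: ignore the condition
    bo1 = (bo >> 3) & 1   # condition value that allows the branch
    bo2 = (bo >> 2) & 1   # 1: do not decrement or test the CTR
    bo3 = (bo >> 1) & 1   # 0: branch if CTR != 0, 1: branch if CTR = 0

    if not bo2:
        ctr = (ctr - 1) & 0xFFFFFFFF          # decrement before testing
    ctr_ok = bool(bo2) or ((ctr != 0) != bool(bo3))
    cond_ok = bool(bo0) or (cond_bit == bo1)
    return ctr_ok and cond_ok, ctr
```

With BO = 16 (binary 10000) and CTR = 2, the model decrements the CTR to 1 and takes the branch; with CTR = 1 it decrements to 0 and falls through, which is the loop-closing behavior of the first row group in the table.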



The "branch always" encoding of the BO operand does not have a "y" bit. 

Setting the "y" bit to 0 indicates that the following behavior is likely: 

• For bcx with a negative value in the displacement operand, the branch is taken. 

• In all other cases (bcx with a non-negative value in the displacement operand, bclrx, 
or bcctrx), the branch is not taken. 

Setting the "y" bit to 1 reverses the preceding indications. 

The sign of the displacement operand is used as described above even if the target is an 
absolute address. The default value for the "y" bit should be 0; it should be set to 1 only 
if software has determined that the prediction corresponding to "y" = 1 is more likely to be 
correct than the prediction corresponding to "y" = 0. Software that does not compute branch 
predictions should set the "y" bit to zero. 

For all three of the branch conditional instructions, the branch should be predicted to be 
taken if the value of the following expression is 1, and to fall through if the value is 0. 

((BO[0] & BO[2]) | S) ⊕ BO[4] 

In the expression above, S (bit 16 of the branch conditional instruction coding) is the sign 
bit of the displacement operand if the instruction has a displacement operand and is 0 if the 
operand is reserved. BO[4] is the "y" bit, or 0 for the "branch always" encoding of the BO 
operand. (Advantage is taken of the fact that, for bclrx and bcctrx, bit 16 of the instruction 
is part of a reserved operand and therefore must be 0.) 
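
The prediction expression can be evaluated directly. The Python sketch below is an illustration only; the function name predict_taken is hypothetical, and the caller is assumed to pass S as 0 for bclrx and bcctrx, as the text requires.

```python
def predict_taken(bo: int, s: int) -> bool:
    """Static prediction: ((BO[0] & BO[2]) | S) xor BO[4].

    bo is the 5-bit BO operand (BO[0] is the most significant bit);
    s is bit 16 of the instruction (the displacement sign, or 0).
    """
    bo0 = (bo >> 4) & 1
    bo2 = (bo >> 2) & 1
    # The "branch always" encoding has no y bit, so BO[4] is treated as 0.
    bo4 = 0 if (bo0 and bo2) else bo & 1
    return bool(((bo0 & bo2) | s) ^ bo4)
```

A "branch if condition true" encoding (BO = 12) with a negative displacement is therefore predicted taken, the same encoding with a non-negative displacement is predicted not taken, and setting the y bit (BO = 13) reverses both predictions.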

3.6.2 BI Operand 

The 5-bit BI operand in branch conditional instructions specifies which of the 32 bits in the 
CR represents the condition to test. 

3.6.3 Basic Branch Mnemonics 

The mnemonics in Table 3-26 allow all the common BO operand encodings to be specified 
as part of the mnemonic, along with the absolute address (AA) and set link register (LK) 
bits. 

Notice that there are no simplified mnemonics for relative and absolute unconditional 
branches. For these, the basic mnemonics b, ba, bl, and bla are used. 

Table 3-26. Simplified Branch Mnemonics 

                                                          LR bit not set                          LR bit set 
Branch Semantics                                          bc        bca       bclr      bcctr     bcl       bcla      bclrl     bcctrl 
                                                          Relative  Absolute  to LR     to CTR    Relative  Absolute  to LR     to CTR 

Branch unconditionally                                    -         -         blr       bctr      -         -         blrl      bctrl 
Branch if condition true                                  bt        bta       btlr      btctr     btl       btla      btlrl     btctrl 
Branch if condition false                                 bf        bfa       bflr      bfctr     bfl       bfla      bflrl     bfctrl 
Decrement CTR, branch if CTR non-zero                     bdnz      bdnza     bdnzlr    -         bdnzl     bdnzla    bdnzlrl   - 
Decrement CTR, branch if CTR non-zero AND condition true  bdnzt     bdnzta    bdnztlr   -         bdnztl    bdnztla   bdnztlrl  - 
Decrement CTR, branch if CTR non-zero AND condition false bdnzf     bdnzfa    bdnzflr   -         bdnzfl    bdnzfla   bdnzflrl  - 
Decrement CTR, branch if CTR zero                         bdz       bdza      bdzlr     -         bdzl      bdzla     bdzlrl    - 
Decrement CTR, branch if CTR zero AND condition true      bdzt      bdzta     bdztlr    -         bdztl     bdztla    bdztlrl   - 
Decrement CTR, branch if CTR zero AND condition false     bdzf      bdzfa     bdzflr    -         bdzfl     bdzfla    bdzflrl   - 


Table 3-26 provides the abbreviated set of simplified mnemonics for the most commonly 
performed conditional branches. Unusual cases of conditional branches can be coded using 
a basic branch conditional mnemonic (bc, bclr, bcctr) with the condition to be tested 
specified as a numeric first operand. 

Instructions using a mnemonic from Table 3-26 that tests a condition specify the condition 
as the first operand of the instruction. Table 3-27 summarizes the mnemonic symbols and 
the equivalent numeric values used to interpret a condition register CR field during a branch 
conditional instruction compare operation. 

Table 3-27. Condition Register CR Field Bit Symbols 

Symbol  Value  Meaning 
lt      0      Less than 
gt      1      Greater than 
eq      2      Equal 
so      3      Summary overflow 
un      3      Unordered (after floating-point comparison) 


Table 3-28 summarizes the mnemonic symbols and the equivalent numeric values used to 
identify the condition register CR field to be evaluated by the compare operation. 

Table 3-28. Condition Register CR Field Identification Symbols 

Symbol  Value  Meaning 
cr0     0      CR0 
cr1     4      CR1 
cr2     8      CR2 
cr3     12     CR3 
cr4     16     CR4 
cr5     20     CR5 
cr6     24     CR6 
cr7     28     CR7 



The simplified branch mnemonics and the symbols in Table 3-27 and Table 3-28 are 
combined in an expression that identifies the bit (0-31) of CR to be tested, as follows: 

Examples: 

1. Decrement CTR and branch if it is still non-zero (closure of a loop controlled by a 
count loaded into CTR). 

bdnz target (equivalent to bc 16,0,target) 

2. Same as (1) but branch only if CTR is non-zero and condition in CR0 is "equal." 

bdnzt eq,target (equivalent to bc 8,2,target) 

3. Same as (2), but "equal" condition is in CR5. 

bdnzt cr5+eq,target (equivalent to bc 8,22,target) 

4. Branch if bit 27 of CR is false. 

bf 27,target (equivalent to bc 4,27,target) 

5. Same as (4), but set the link register. This is a form of conditional "call." 

bfl 27,target (equivalent to bcl 4,27,target) 
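
The numeric BI values in these examples follow from adding a field value from Table 3-28 to a bit value from Table 3-27. The short Python sketch below reproduces that arithmetic; the names CR_BIT and bi_operand are hypothetical and exist only for this illustration.

```python
# Bit symbols from Table 3-27 (so and un share the value 3).
CR_BIT = {"lt": 0, "gt": 1, "eq": 2, "so": 3, "un": 3}

def bi_operand(cr_field: int, symbol: str) -> int:
    """Return the CR bit number (0-31) for a field number 0-7 and a bit symbol.

    The field symbols cr0..cr7 have the values 0, 4, ..., 28, so the
    expression crN+symbol evaluates to 4*N + bit value.
    """
    return 4 * cr_field + CR_BIT[symbol]
```

For example, cr5+eq evaluates to 20 + 2 = 22, which is the BI value shown in example 3.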






3.6.4 Branch Mnemonics Incorporating Conditions 

The mnemonics defined in Table 3-30 are variations of the "branch if condition true" and 
"branch if condition false" BO encodings, with the most common values of the BI operand 
represented in the mnemonic rather than specified as a numeric operand. 

The two-letter codes for the most common combinations of branch conditions are shown in 
Table 3-29. 

Table 3-29. Two-Letter Codes for Branch Comparison Conditions 

Code  Meaning 
lt    Less than 
le    Less than or equal 
eq    Equal 
ge    Greater than or equal 
gt    Greater than 
nl    Not less than 
ne    Not equal 
ng    Not greater than 
so    Summary overflow 
ns    Not summary overflow 
un    Unordered (after floating-point comparison) 
nu    Not unordered (after floating-point comparison) 



These codes are reflected in the simplified mnemonics shown in Table 3-30. 

Table 3-30. Simplified Branch Mnemonics Incorporating Comparison Conditions 

                                 LR bit not set                          LR bit set 
Branch Semantics                 bc        bca       bclr      bcctr     bcl       bcla      bclrl     bcctrl 
                                 Relative  Absolute  to LR     to CTR    Relative  Absolute  to LR     to CTR 

Branch if less than              blt       blta      bltlr     bltctr    bltl      bltla     bltlrl    bltctrl 
Branch if less than or equal     ble       blea      blelr     blectr    blel      blela     blelrl    blectrl 
Branch if equal                  beq       beqa      beqlr     beqctr    beql      beqla     beqlrl    beqctrl 
Branch if greater than or equal  bge       bgea      bgelr     bgectr    bgel      bgela     bgelrl    bgectrl 
Branch if greater than           bgt       bgta      bgtlr     bgtctr    bgtl      bgtla     bgtlrl    bgtctrl 
Branch if not less than          bnl       bnla      bnllr     bnlctr    bnll      bnlla     bnllrl    bnlctrl 
Branch if not equal              bne       bnea      bnelr     bnectr    bnel      bnela     bnelrl    bnectrl 
Branch if not greater than       bng       bnga      bnglr     bngctr    bngl      bngla     bnglrl    bngctrl 
Branch if summary overflow       bso       bsoa      bsolr     bsoctr    bsol      bsola     bsolrl    bsoctrl 
Branch if not summary overflow   bns       bnsa      bnslr     bnsctr    bnsl      bnsla     bnslrl    bnsctrl 
Branch if unordered              bun       buna      bunlr     bunctr    bunl      bunla     bunlrl    bunctrl 
Branch if not unordered          bnu       bnua      bnulr     bnuctr    bnul      bnula     bnulrl    bnuctrl 


Instructions using the mnemonics in Table 3-30 specify the condition register field in an 
optional first operand. If the CR field being tested is CR0, this operand need not be 
specified. Otherwise, one of the CR field symbols listed in Table 3-28 is coded as the first 
operand. 

Examples: 

1. Branch if CR0 reflects condition "not equal." 

bne target (equivalent to bc 4,2,target) 

2. Same as (1), but condition is in CR3. 

bne cr3,target (equivalent to bc 4,14,target) 

3. Branch to an absolute target if CR4 specifies "greater than," setting the link register. 
This is a form of conditional "call", as the return address is saved in the link register. 

bgtla cr4,target (equivalent to bcla 12,17,target) 

4. Same as (3), but target address is in the count register. 

bgtctrl cr4 (equivalent to bcctrl 12,17) 
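
The equivalences in these examples can be reproduced mechanically. The Python sketch below maps each two-letter code of Table 3-29 to a BO value (12 for "branch if condition true", 4 for "branch if condition false") and a CR bit within the field; the names CONDITION and expand_condition are hypothetical, and the mapping is an illustration derived from Tables 3-25 and 3-27 rather than an assembler's actual implementation.

```python
# code -> (BO value, bit within the CR field)
CONDITION = {
    "lt": (12, 0), "ge": (4, 0), "nl": (4, 0),
    "gt": (12, 1), "le": (4, 1), "ng": (4, 1),
    "eq": (12, 2), "ne": (4, 2),
    "so": (12, 3), "ns": (4, 3),
    "un": (12, 3), "nu": (4, 3),
}

def expand_condition(code: str, cr_field: int = 0):
    """Return the (BO, BI) pair for a two-letter code and a CR field 0-7.

    The CR field defaults to CR0, matching the optional first operand.
    """
    bo, bit = CONDITION[code]
    return bo, 4 * cr_field + bit
```

Calling expand_condition("ne", 3) gives (4, 14) and expand_condition("gt", 4) gives (12, 17), matching examples 2 and 3.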
3.6.5 Branch Instructions 

Table 3-31 describes the branch instructions provided by the MPC601. 

Table 3-31. Branch Instructions 

Name:            Branch 
Mnemonic:        b, ba, bl, bla 
Operand Syntax:  imm_addr 
Operation: 
    b      Branch. Branch to the address computed as the sum of the immediate 
           address and the address of the current instruction. 
    ba     Branch Absolute. Branch to the absolute address specified. 
    bl     Branch then Link. Branch to the address computed as the sum of the 
           immediate address and the address of the current instruction. The 
           instruction address following this instruction is placed into the 
           link register (LR). 
    bla    Branch Absolute then Link. Branch to the absolute address specified. 
           The instruction address following this instruction is placed into the 
           link register (LR). 

Name:            Branch Conditional 
Mnemonic:        bc, bca, bcl, bcla 
Operand Syntax:  BO,BI,target_addr 
Operation: 
    The BI operand specifies the bit in the condition register (CR) to be used as 
    the condition of the branch. The BO operand is used as described in Table 3-25. 
    bc     Branch Conditional. Branch conditionally to the address computed as 
           the sum of the immediate address and the address of the current 
           instruction. 
    bca    Branch Conditional Absolute. Branch conditionally to the absolute 
           address specified. 
    bcl    Branch Conditional then Link. Branch conditionally to the address 
           computed as the sum of the immediate address and the address of the 
           current instruction. The instruction address following this 
           instruction is placed into the link register. 
    bcla   Branch Conditional Absolute then Link. Branch conditionally to the 
           absolute address specified. The instruction address following this 
           instruction is placed into the link register. 

Name:            Branch Conditional to Link Register 
Mnemonic:        bclr, bclrl 
Operand Syntax:  BO,BI 
Operation: 
    The BI operand specifies the bit in the condition register to be used as the 
    condition of the branch. The BO operand is used as described in Table 3-25. 
    bclr   Branch Conditional to Link Register. Branch conditionally to the 
           address in the link register. 
    bclrl  Branch Conditional to Link Register then Link. Branch conditionally 
           to the address specified in the link register. The instruction 
           address following this instruction is then placed into the link 
           register. 

Name:            Branch Conditional to Count Register 
Mnemonic:        bcctr, bcctrl 
Operand Syntax:  BO,BI 
Operation: 
    The BI operand specifies the bit in the condition register to be used as the 
    condition of the branch. The BO operand is used as described in Table 3-25. 
    bcctr  Branch Conditional to Count Register. Branch conditionally to the 
           address specified in the count register. 
    bcctrl Branch Conditional to Count Register then Link. Branch conditionally 
           to the address specified in the count register. The instruction 
           address following this instruction is placed into the link register. 
    Note: If the "decrement and test CTR" option is specified (BO[2]=0), the 
    instruction form is invalid. For the MPC601, the decremented count register 
    is tested for zero and branches based on this test, but instruction fetching 
    is directed to the address specified by the non-decremented version of the 
    count register. Use of this invalid form of this instruction is not 
    recommended. 



3.6.6 Condition Register Logical Instructions 

Similar to the system call (sc) instruction, condition register logical instructions, shown in 
Table 3-32, and the move condition register field (mcrf) instruction are defined as flow 
control instructions, although they are executed by the IU. 

Note that if the link register update option (LR) is enabled for any of these instructions, the 
PowerPC architecture defines these forms of the instructions as invalid; however, the 
MPC601 executes these instructions and leaves the link register in an undefined state. 

Table 3-32. Condition Register Logical Instructions 

Name:            Condition Register AND 
Mnemonic:        crand 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is ANDed with the 
                 bit in the condition register specified by crbB. The result is placed 
                 into the condition register bit specified by crbD. 

Name:            Condition Register OR 
Mnemonic:        cror 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is ORed with the 
                 bit in the condition register specified by crbB. The result is placed 
                 into the condition register bit specified by crbD. 

Name:            Condition Register XOR 
Mnemonic:        crxor 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is XORed with the 
                 bit in the condition register specified by crbB. The result is placed 
                 into the condition register bit specified by crbD. 

Name:            Condition Register NAND 
Mnemonic:        crnand 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is ANDed with the 
                 bit in the condition register specified by crbB. The complemented 
                 result is placed into the condition register bit specified by crbD. 

Name:            Condition Register NOR 
Mnemonic:        crnor 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is ORed with the 
                 bit in the condition register specified by crbB. The complemented 
                 result is placed into the condition register bit specified by crbD. 

Name:            Condition Register Equivalent 
Mnemonic:        creqv 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is XORed with the 
                 bit in the condition register specified by crbB. The complemented 
                 result is placed into the condition register bit specified by crbD. 

Name:            Condition Register AND with Complement 
Mnemonic:        crandc 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is ANDed with the 
                 complement of the bit in the condition register specified by crbB and 
                 the result is placed into the condition register bit specified by crbD. 

Name:            Condition Register OR with Complement 
Mnemonic:        crorc 
Operand Syntax:  crbD,crbA,crbB 
Operation:       The bit in the condition register specified by crbA is ORed with the 
                 complement of the bit in the condition register specified by crbB and 
                 the result is placed into the condition register bit specified by crbD. 

Name:            Move Condition Register Field 
Mnemonic:        mcrf 
Operand Syntax:  crfD,crfS 
Operation:       The contents of crfS are copied into crfD. No other condition 
                 register fields are changed. 
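
The eight bit-wise operations in Table 3-32 differ only in the Boolean function applied to the two source bits. The Python sketch below models them on a 32-bit condition register value; the names CR_OPS, cr_bit, and cr_logical are hypothetical, and CR bit 0 is taken to be the most significant bit, following the manual's numbering.

```python
MASK32 = 0xFFFFFFFF

# Boolean function applied to the source bits a (crbA) and b (crbB).
CR_OPS = {
    "crand":  lambda a, b: a & b,
    "cror":   lambda a, b: a | b,
    "crxor":  lambda a, b: a ^ b,
    "crnand": lambda a, b: 1 - (a & b),
    "crnor":  lambda a, b: 1 - (a | b),
    "creqv":  lambda a, b: 1 - (a ^ b),
    "crandc": lambda a, b: a & (1 - b),
    "crorc":  lambda a, b: a | (1 - b),
}

def cr_bit(cr: int, n: int) -> int:
    """Return CR bit n, where bit 0 is the most significant bit."""
    return (cr >> (31 - n)) & 1

def cr_logical(op: str, cr: int, crbD: int, crbA: int, crbB: int) -> int:
    """Return the CR value after applying op; only bit crbD can change."""
    result = CR_OPS[op](cr_bit(cr, crbA), cr_bit(cr, crbB))
    mask = 1 << (31 - crbD)
    return (cr & ~mask & MASK32) | (result << (31 - crbD))
```

Two consequences of the definitions are easy to confirm with this model: crxor with identical operands always clears the destination bit, and creqv with identical operands always sets it.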



3.6.7 System Linkage Instructions 

This section describes the system linkage instructions (see Table 3-33). The system call (sc) 
instruction permits a program to call on the system to perform a service and the system to 
return from performing a service or from processing an exception. 

Table 3-33. System Linkage Instructions 

Name:            System Call 
Mnemonic:        sc 
Operand Syntax:  (none) 
Operation:       When executed, the effective address of the instruction following the 
                 sc instruction is placed into SRR0. Bits 16-31 of the MSR are placed 
                 into bits 16-31 of SRR1, and bits 0-15 of SRR1 are set to undefined 
                 values. Then a system call exception is generated. The exception 
                 causes the MSR to be altered as described in Section 5.4, "Exception 
                 Definitions." 

                 The exception causes the next instruction to be fetched from offset 
                 x'C00' from the base physical address indicated by the new setting of 
                 MSR[IP]. For a discussion of POWER compatibility with respect to 
                 instruction bits 16-29, refer to Appendix K, "Incompatibilities with 
                 the POWER Architecture." To ensure compatibility with future versions 
                 of the PowerPC architecture, bits 16-29 should be coded as zero and 
                 bit 30 should be coded as a 1. 

                 This instruction is context synchronizing. 

Name:            Return from Interrupt 
Mnemonic:        rfi 
Operand Syntax:  (none) 
Operation:       Bits 16-31 of SRR1 are placed into bits 16-31 of the MSR, then the 
                 next instruction is fetched, under control of the new MSR value, from 
                 the address SRR0[0-29] || b'00'. 

                 This instruction is a supervisor-level instruction and is context 
                 synchronizing. 
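
The register transfers described for sc and rfi can be summarized in a few lines. The Python sketch below models only what Table 3-33 states: the function names are hypothetical, the undefined bits 0-15 of SRR1 are arbitrarily modeled as zero, and the MSR changes made by the exception itself (Section 5.4) and the vector fetch are deliberately left out.

```python
MASK32 = 0xFFFFFFFF

def system_call_save(msr: int, cia: int):
    """Return (SRR0, SRR1) as saved by sc executed at address cia."""
    srr0 = (cia + 4) & MASK32        # address of the instruction after sc
    srr1 = msr & 0x0000FFFF          # MSR[16-31]; bits 0-15 modeled as 0
    return srr0, srr1

def return_from_interrupt(msr: int, srr0: int, srr1: int):
    """Return (new MSR, next instruction address) for rfi."""
    new_msr = (msr & 0xFFFF0000) | (srr1 & 0x0000FFFF)   # restore MSR[16-31]
    next_address = srr0 & 0xFFFFFFFC                     # SRR0[0-29] || b'00'
    return new_msr, next_address
```

In this model, an sc at address 0x1000 followed later by an rfi resumes execution at 0x1004 with the saved low-order MSR bits restored.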



3.6.8 Simplified Mnemonics for Branch Processor Instructions 

To simplify assembly language programming, a set of simplified mnemonics and symbols 
is provided that defines simple shorthand for the most frequently used forms of branch 
conditional, compare, trap, rotate and shift, and certain other instructions. 

Mnemonics are provided so that branch conditional instructions can be coded with the 
condition as part of the instruction mnemonic rather than as a numeric operand. Some of 
these are shown as examples with the branch instructions. 

In some implementations the processor may keep a stack of the link register values most 
recently set by branch and link instructions, with the possible exception of the form shown 
below for obtaining the address of the next instruction. To benefit from this stack, the 
following programming conventions should be used. 

Let A, B, and Glue be programs. 

Obtaining the address of the next instruction: Use the following form of branch and link. 

bcl 20,31,$+4 

Loop Counts: Keep them in the count register, and use one of the branch conditional 
instructions to decrement the count and to control branching (e.g., branching back to the 
start of a loop if the decremented counter value is non-zero). 

Computed GOTOs, Case Statements, Etc.: Use the count register to hold the address to 
branch to, and use the bcctr instruction with the link register option disabled (LK=0) to 
branch to the selected address. 

Direct Subroutine Linkage: Here A calls B and B returns to A. The two branches should be 
as follows: 

• A calls B: use a branch instruction that enables the link register (LK=1). 

• B returns to A: use the bclr instruction with the link register option disabled (LK=0) 
(the return address is in, or can be restored to, the link register). 

Indirect Subroutine Linkage: Here A calls Glue, Glue calls B, and B returns to A rather than 
to Glue. (Such a calling sequence is common in linkage code used when the subroutine that 
the programmer wants to call, here B, is in a different module from the caller: the binder 
inserts "glue" code to mediate the branch.) The three branches should be as follows: 

• A calls Glue. Use a branch instruction that sets the link register with the link register 
option enabled (LK=1). 

• Glue calls B. Place the address of B in the count register, and use the bcctr 
instruction with the link register option disabled (LK=0). 

• B returns to A. Use the bclr instruction with the link register option disabled (LK=0) 
(the return address is in, or can be restored to, the link register). 

PowerPC-compliant assemblers provide the mnemonics and symbols listed here and 
possibly others. Programs written to be portable across various assemblers for the PowerPC 
architecture should not assume the existence of mnemonics not defined here. 

3.6.9 Trap Mnemonics 

The trap instructions shown in Table 3-34 are provided to test for a specified set of 
conditions. If any of the conditions tested by a trap instruction are met, the system trap 
handler is invoked. If the tested conditions are not met, instruction execution continues 
normally. 

Table 3-34. Trap Instructions 

Name:            Trap Word Immediate 
Mnemonic:        twi 
Operand Syntax:  TO,rA,SIMM 
Operation:       The contents of rA are compared with the sign-extended SIMM 
                 operand. If any bit in the TO operand is set to 1 and its corresponding 
                 condition is met by the result of the comparison, then the system trap 
                 handler is invoked. 

Name:            Trap Word 
Mnemonic:        tw 
Operand Syntax:  TO,rA,rB 
Operation:       The contents of rA are compared with the contents of rB. If any bit in 
                 the TO operand is set to 1 and its corresponding condition is met by 
                 the result of the comparison, then the system trap handler is invoked. 



The trap instructions evaluate a trap condition as follows: 

The contents of register rA are compared with either the sign-extended SIMM field or with 
the contents of register rB, depending on the trap instruction. The comparison results in 
five conditions which are ANDed with operand TO. If the result is not 0, the trap exception 
handler is invoked. These conditions are provided in Table 3-35. 

Table 3-35. TO Operand Bit Encoding 

TO Bit    ANDed with Condition 
0         Less than 
1         Greater than 
2         Equal 
3         Logically less than 
4         Logically greater than 
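
The evaluation described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the processor's implementation: the function names are hypothetical, operands are 32-bit values, and TO bit 0 is taken to be the most-significant bit of the 5-bit TO operand (weight 16), consistent with the encodings in Table 3-36. For twi, the second operand would be the sign-extended SIMM value.

```python
def to_signed32(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def trap_taken(to, a, b):
    """Return True if a trap with 5-bit operand `to` fires when comparing a with b."""
    sa, sb = to_signed32(a), to_signed32(b)
    ua, ub = a & 0xFFFFFFFF, b & 0xFFFFFFFF

    conditions = 0
    if sa < sb:
        conditions |= 0b10000  # TO bit 0: less than
    if sa > sb:
        conditions |= 0b01000  # TO bit 1: greater than
    if sa == sb:
        conditions |= 0b00100  # TO bit 2: equal
    if ua < ub:
        conditions |= 0b00010  # TO bit 3: logically less than
    if ua > ub:
        conditions |= 0b00001  # TO bit 4: logically greater than

    # The five conditions are ANDed with TO; any surviving bit causes the trap.
    return (to & conditions) != 0
```

Note how the signed and logical (unsigned) comparisons diverge: x'FFFFFFFF' is less than 0 as a signed value but logically greater than 0.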



A standard set of codes has been adopted for the most common combinations of trap 
conditions, as shown in Table 3-36. The mnemonics defined in Table 3-37 are variations of 
the trap instructions, with the most useful values of the trap instruction TO operand 
represented as a mnemonic rather than specified as a numeric operand. 

Table 3-36. Trap Mnemonics Coding 

Code    Meaning                            TO Operand    <    >    =    <U   >U 
                                           Encoding 
lt      Less than                          16            1    0    0    0    0 
le      Less than or equal                 20            1    0    1    0    0 
eq      Equal                              4             0    0    1    0    0 
ge      Greater than or equal              12            0    1    1    0    0 
gt      Greater than                       8             0    1    0    0    0 
nl      Not less than                      12            0    1    1    0    0 
ne      Not equal                          24            1    1    0    0    0 
ng      Not greater than                   20            1    0    1    0    0 
llt     Logically less than                2             0    0    0    1    0 
lle     Logically less than or equal       6             0    0    1    1    0 
lge     Logically greater than or equal    5             0    0    1    0    1 
lgt     Logically greater than             1             0    0    0    0    1 
lnl     Logically not less than            5             0    0    1    0    1 
lng     Logically not greater than         6             0    0    1    1    0 
(none)  Unconditional                      31            1    1    1    1    1 

Note: <U indicates an unsigned less than evaluation will be performed. 
>U indicates an unsigned greater than evaluation will be performed. 






These codes are reflected in the mnemonics shown in Table 3-37. 

Table 3-37. Trap Mnemonics 

Trap Semantics                               32-Bit Comparison 
                                             twi Immediate    tw Register 
Trap unconditionally                         -                trap 
Trap if less than                            twlti            twlt 
Trap if less than or equal                   twlei            twle 
Trap if equal                                tweqi            tweq 
Trap if greater than or equal                twgei            twge 
Trap if greater than                         twgti            twgt 
Trap if not less than                        twnli            twnl 
Trap if not equal                            twnei            twne 
Trap if logically less than                  twllti           twllt 
Trap if logically less than or equal         twllei           twlle 
Trap if logically greater than or equal      twlgei           twlge 
Trap if logically greater than               twlgti           twlgt 
Trap if logically not less than              twlnli           twlnl 



Examples: 

• Trap if rA, considered as a 32-bit quantity, is logically greater than x'7FF'. 

twlgti rA,x'7FF' (equivalent to twi 1,rA,x'7FF') 

• Trap unconditionally. 

trap (equivalent to tw 31,0,0) 
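
The relationship between the simplified trap mnemonics and the underlying tw and twi instructions can be illustrated with a short Python sketch. It is not an assembler: the function name and the dictionary are hypothetical, the codes are the TO encodings from Table 3-36, and operands are passed through as an uninterpreted string.

```python
# TO operand encodings for each condition code, as listed in Table 3-36.
TO_CODES = {
    "lt": 16, "le": 20, "eq": 4, "ge": 12, "gt": 8, "nl": 12, "ne": 24,
    "ng": 20, "llt": 2, "lle": 6, "lge": 5, "lgt": 1, "lnl": 5, "lng": 6,
}


def expand_trap(mnemonic, operands=""):
    """Expand a simplified trap mnemonic into its tw/twi equivalent."""
    if mnemonic == "trap":
        return "tw 31,0,0"
    if not mnemonic.startswith("tw"):
        raise ValueError(f"not a trap mnemonic: {mnemonic}")

    body = mnemonic[2:]
    # A trailing 'i' after a known condition code selects the immediate form.
    immediate = body.endswith("i") and body[:-1] in TO_CODES
    code = body[:-1] if immediate else body
    if code not in TO_CODES:
        raise ValueError(f"unknown trap condition: {mnemonic}")

    base = "twi" if immediate else "tw"
    return f"{base} {TO_CODES[code]},{operands}"
```

Calling `expand_trap("twlgti", "rA,0x7FF")` reproduces the first example above in its numeric form, `twi 1,rA,0x7FF`.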

3.7 Processor Control Instructions 

Processor control instructions are used to read from and write to the machine state register 
(MSR), special purpose registers (SPRs) and condition register (CR). 

3.7.1 Move to/from Special Purpose Register Instructions 

The MPC601 adds an additional register (the MQ register) to the user register set and 
programming model. As a result, the mtspr and mfspr instructions have been extended to 
accommodate access to the MQ register for the MPC601. The SPR encoding for the MQ 
register is b'00000 00000'. 

The MPC601 also allows user-level read access to the decrementer register (DEC). The 
SPR encoding for DEC is b'00110 00000' and is valid only for the mfspr instruction. 

During the execution of the mtspr instruction, the MPC601 does not fully decode SPR 
values for the XER, DEC, LR, MQ, CTR, RTCL or RTCU registers. Similarly, during the 
execution of the mfspr, the MPC601 does not fully decode the SPR values for the XER, 
LR, MQ, or CTR registers. Instead, it only decodes the upper five bits of the SPR field and 
assumes that the lower five bits are cleared to zeros. The PowerPC architecture defines the 
mfspr and mtspr instructions with the condition register update option enabled to leave 
condition register field CR0 undefined. In this case, the MPC601 sets condition register 
field CR0 to an undefined value. Move to/from Special Purpose Register instructions are 
listed in Table 3-38. For more information see Chapter 10, "Instruction Set." 

Simplified mnemonics are provided for the mtspr and mfspr instructions so they can be 
coded with the SPR name as part of the mnemonic rather than as a numeric operand. Some 
of these are shown as examples with the two instructions. 

Table 3-38. Move to/from Special Purpose Register Instructions 

Name:            Move to Special Purpose Register 
Mnemonic:        mtspr 
Operand Syntax:  SPR,rS 
Operation:       The SPR field denotes a special purpose register, encoded as shown 
                 in Table 3-39 and Table 3-40 below. The contents of rS are placed 
                 into the designated SPR. 

                 Simplified mnemonic examples: 
                 mtxer rA     mtspr 1,rA 
                 mtlr rA      mtspr 8,rA 
                 mtctr rA     mtspr 9,rA 

Name:            Move from Special Purpose Register 
Mnemonic:        mfspr 
Operand Syntax:  rD,SPR 
Operation:       The SPR field denotes a special purpose register, encoded as shown 
                 in Table 3-39 and Table 3-40 below. The contents of the designated 
                 SPR are placed into rD. 

                 Simplified mnemonic examples: 
                 mfxer rA     mfspr rA,1 
                 mflr rA      mfspr rA,8 
                 mfctr rA     mfspr rA,9 

Name:            Move to Condition Register Fields 
Mnemonic:        mtcrf 
Operand Syntax:  CRM,rS 
Operation:       The contents of rS are placed into the condition register under control 
                 of the field mask specified by operand CRM. The field mask identifies 
                 the 4-bit fields affected. Let i be an integer in the range 0-7. If 
                 CRM(i) = 1, then CR field i (CR bits 4*i through 4*i+3) is set to the 
                 contents of the corresponding field of rS. 

                 In some PowerPC implementations, this instruction may perform 
                 more slowly when only a portion of the fields are updated as opposed 
                 to all of the fields. This is not true for the MPC601. 

Name:            Move to Condition Register from XER 
Mnemonic:        mcrxr 
Operand Syntax:  crfD 
Operation:       The contents of XER[0-3] are copied into the condition register field 
                 designated by crfD. All other fields of the condition register remain 
                 unchanged. XER[0-3] is cleared to 0. 

Name:            Move from Condition Register 
Mnemonic:        mfcr 
Operand Syntax:  rD 
Operation:       The contents of the condition register are placed into rD. 

Name:            Move to Machine State Register 
Mnemonic:        mtmsr 
Operand Syntax:  rS 
Operation:       The contents of rS are placed into the MSR. 

                 This instruction is a supervisor-level instruction and is context 
                 synchronizing. 

Name:            Move from Machine State Register 
Mnemonic:        mfmsr 
Operand Syntax:  rD 
Operation:       The contents of the MSR are placed into rD. This is a supervisor- 
                 level instruction. 



For mtspr and mfspr instructions, the SPR number coded in assembly language does not 
appear directly as a 10-bit binary number in the instruction. The number coded is split into 
two 5-bit halves that are reversed in the instruction, with the high-order 5 bits appearing in 
bits 16-20 of the instruction and the low-order 5 bits in bits 11-15. 
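
The swap of the two 5-bit halves can be sketched in Python. The helper names are hypothetical; the output format mirrors the b'xxxxx xxxxx' notation used in the SPR encoding tables, where the left group is SPR[0-4] and the right group is SPR[5-9].

```python
def spr_field(number):
    """Return the 10-bit SPR field as it appears in instruction bits 11-20."""
    if not 0 <= number <= 1023:
        raise ValueError("SPR number must fit in 10 bits")
    high = number >> 5       # high-order 5 bits of the coded number
    low = number & 0x1F      # low-order 5 bits of the coded number
    # The halves are reversed: low-order bits land in instruction bits 11-15,
    # high-order bits in instruction bits 16-20.
    return (low << 5) | high


def spr_bits(number):
    """Format the encoded field in the manual's b'xxxxx xxxxx' notation."""
    field = spr_field(number)
    return f"b'{field >> 5:05b} {field & 0x1F:05b}'"
```

For instance, SPR 8 (LR) encodes as b'01000 00000' and SPR 272 (SPRG0) as b'10000 01000'. Applying `spr_field` twice returns the original number, because the transformation only exchanges the two halves.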

Table 3-39 summarizes SPR encodings that the MPC601 recognizes when operating at the 
user level. 

Table 3-39. User-Level SPR Encodings 

Decimal        SPR[0-4] SPR[5-9]    Register    Description 
Value in rD                         Name 
0              b'00000 00000'       MQ          MQ register 
1              b'00001 00000'       XER         Integer exception register 
8              b'01000 00000'       LR          Link register 
9              b'01001 00000'       CTR         Count register 
4              b'00100 00000'       RTCU        Real-time clock upper register (1) 
5              b'00101 00000'       RTCL        Real-time clock lower register (1) 
6              b'00110 00000'       DEC         Decrementer register (2) 

(1) Read-only. 

(2) Access to the DEC register is restricted to read-only while the 
processor is in user mode. User-level decrementer access is 
provided for POWER compatibility, and is specific to the MPC601. 

Table 3-40 summarizes SPR encodings that the MPC601 recognizes when operating at the 
supervisor level. 



Table 3-40. Supervisor-Level SPR Encodings 

Decimal        SPR[0-4] SPR[5-9]    Register            Description 
Value in rD                         Name 
4              b'00100 00000'       RTCU                Real-time clock upper register 
5              b'00101 00000'       RTCL                Real-time clock lower register 
18             b'10010 00000'       DSISR               DAE/source instruction service register 
19             b'10011 00000'       DAR                 Data address register 
22             b'10110 00000'       DEC                 Decrementer register 
25             b'11001 00000'       SDR1                Table search descriptor register 
26             b'11010 00000'       SRR0                Save and restore register 0 
27             b'11011 00000'       SRR1                Save and restore register 1 
272            b'10000 01000'       SPRG0               SPR general 0 
273            b'10001 01000'       SPRG1               SPR general 1 
274            b'10010 01000'       SPRG2               SPR general 2 
275            b'10011 01000'       SPRG3               SPR general 3 
282            b'11010 01000'       EAR                 External access register 
287            b'11111 01000'       PVR                 Processor version register 
528            b'10000 10000'       BAT0U               Instruction BAT 0 upper 
529            b'10001 10000'       BAT0L               Instruction BAT 0 lower 
530            b'10010 10000'       BAT1U               Instruction BAT 1 upper 
531            b'10011 10000'       BAT1L               Instruction BAT 1 lower 
532            b'10100 10000'       BAT2U               Instruction BAT 2 upper 
533            b'10101 10000'       BAT2L               Instruction BAT 2 lower 
534            b'10110 10000'       BAT3U               Instruction BAT 3 upper 
535            b'10111 10000'       BAT3L               Instruction BAT 3 lower 
1008           b'10000 11111'       Checkstop (HID0)    Checkstop sources and enables register 
1009           b'10001 11111'       Debug (HID1)        Debug modes register 
1010           b'10010 11111'       IABR (HID2)         Instruction address breakpoint register 
1013           b'10101 11111'       DABR (HID5)         Data address breakpoint register 
1023           b'11111 11111'       PIR (HID15)         Processor identification register 

If the SPR field contains any value other than one of these implementation-specific values 
or one of the values shown in Table 3-40, the instruction form is invalid. For an invalid 
instruction form in which SPR[0]=1, the system supervisor-level instruction error handler will 
be invoked if the instruction is executed by a user-level program. If the instruction is 
executed by a supervisor-level program, the result is a no-op. 

SPR[0]=1 if and only if writing the register is supervisor-level. Execution of this instruction 
specifying a defined and supervisor-level register when MSR[PR]=1 results in a privilege 
violation type program exception. 
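
As a rough illustration of the SPR[0] rule, the following Python sketch extracts that bit from an SPR number as written in assembly language. It assumes that SPR[0] is the leftmost bit of the field as laid out in Table 3-39 and Table 3-40, which corresponds to the bit with weight 16 in the assembly-language SPR number because the two 5-bit halves are swapped in the instruction. The function name is hypothetical.

```python
def spr_bit0(number):
    """Return SPR[0] of the encoded field for an assembly-language SPR number."""
    if not 0 <= number <= 1023:
        raise ValueError("SPR number must fit in 10 bits")
    # SPR[0-4] holds the low-order 5 bits of the number, so SPR[0] is the
    # most-significant of those five bits.
    return (number >> 4) & 1
```

Under this assumption the user-level encodings in Table 3-39 (0, 1, 8, 9, 4, 5, 6) all give 0, while encodings such as 18 (DSISR), 22 (DEC), 272 (SPRG0), and 1008 (HID0) give 1.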

SPR encodings for the DEC, MQ, RTCL and RTCU registers are not part of the PowerPC 
architecture. 

The PVR (processor version register) is a read-only register. 

Note: For compatibility with future versions of this architecture, only SPR numbers 
discussed in these instruction descriptions should be used. 

The mtspr and mfspr instructions specify a Special Purpose Register (SPR) as a numeric 
operand. Simplified mnemonics are provided that represent the SPR in the mnemonic rather 
than requiring it to be coded as an operand. Table 3-41 below specifies the simplified 
mnemonics provided on the MPC601 for SPR operations. 



Table 3-41. SPR Simplified Mnemonics 

Special Purpose Register                     Move to SPR                             Move from SPR 
                                             Simplified       Instruction            Simplified       Instruction 
                                             Mnemonic                                Mnemonic 
Integer unit exception register              mtxer rS         mtspr 1,rS             mfxer rD         mfspr rD,1 
Link register                                mtlr rS          mtspr 8,rS             mflr rD          mfspr rD,8 
Count register                               mtctr rS         mtspr 9,rS             mfctr rD         mfspr rD,9 
DAE/source instruction service register      mtdsisr rS       mtspr 18,rS            mfdsisr rD       mfspr rD,18 
Data address register                        mtdar rS         mtspr 19,rS            mfdar rD         mfspr rD,19 
Decrementer                                  mtdec rS         mtspr 22,rS            mfdec rD         mfspr rD,22 
Table search descriptor register 1           mtsdr1 rS        mtspr 25,rS            mfsdr1 rD        mfspr rD,25 
Status save/restore register 0               mtsrr0 rS        mtspr 26,rS            mfsrr0 rD        mfspr rD,26 
Status save/restore register 1               mtsrr1 rS        mtspr 27,rS            mfsrr1 rD        mfspr rD,27 
General special purpose registers G0         mtsprg n,rS      mtspr 272+n,rS         mfsprg rD,n      mfspr rD,272+n 
through G3 
External access register                     mtear rS         mtspr 282,rS           mfear rD         mfspr rD,282 
Processor version register                   -                -                      mfpvr rD         mfspr rD,287 
BAT register, upper                          mtibatu n,rS     mtspr 528+(2*n),rS     mfibatu rD,n     mfspr rD,528+(2*n) 
BAT register, lower                          mtibatl n,rS     mtspr 529+(2*n),rS     mfibatl rD,n     mfspr rD,529+(2*n) 
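
The indexed rows of Table 3-41 compute the SPR number from the index n. A small Python sketch of that arithmetic follows; the function names are hypothetical, and the valid range 0-3 for n is taken from the SPRG0-SPRG3 and BAT0-BAT3 entries in Table 3-40.

```python
def _check_index(n):
    if n not in (0, 1, 2, 3):
        raise ValueError("index must be in the range 0-3")


def sprg_number(n):
    """SPR number used by mtsprg/mfsprg for SPRGn."""
    _check_index(n)
    return 272 + n


def bat_upper_number(n):
    """SPR number used by mtibatu/mfibatu for the upper BAT register n."""
    _check_index(n)
    return 528 + 2 * n


def bat_lower_number(n):
    """SPR number used by mtibatl/mfibatl for the lower BAT register n."""
    _check_index(n)
    return 529 + 2 * n
```

The results agree with Table 3-40: index 3 gives 275 for SPRG3, 534 for BAT3U, and 535 for BAT3L.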



3.8 Memory Control Instructions 

This section describes memory control instructions, which include the following: 

• Cache management instructions 

• Segment register manipulation instructions 

• Translation lookaside buffer management instructions 

3.8.1 Supervisor-Level Cache Management Instruction 

This section summarizes the operation of the only supervisor-level cache management 
instruction implemented on the MPC601. 

Table 3-42. Cache Management Supervisor-Level Instruction 

Name:            Data Cache Block Invalidate 
Mnemonic:        dcbi 
Operand Syntax:  rA,rB 
Operation:       The effective address is the sum (rA|0)+(rB). 

                 The action taken depends on the memory mode associated with the 
                 target, and the state (modified, unmodified) of the block. The 
                 following list describes the action to take if the block containing the 
                 byte addressed by the EA is or is not in the cache. 

                 Coherency Required (WIM = xx1) 

                 - Unmodified Block: Invalidates copies of the block in the 
                   caches of all processors. 

                 - Modified Block: Invalidates copies of the block in the 
                   caches of all processors. (Discards the modified 
                   contents.) 

                 - Absent Block: If copies are in the caches of any other 
                   processor, causes the copies to be invalidated. 
                   (Discards any modified contents.) 

                 Coherency Not Required (WIM = xx0) 

                 - Unmodified Block: Invalidates the block in the local 
                   cache. 

                 - Modified Block: Invalidates the block in the local cache. 
                   (Discards the modified contents.) 

                 - Absent Block: No action is taken. 

                 When data address translation is enabled, MSR[DT]=1, and the 
                 logical address has no translation, a data access exception occurs. 
                 See Section 5.4.3, "Data Access Exception (x'00300')." 

                 The function of this instruction is independent of the write-through 
                 and cache-inhibited/allowed modes determined by the WIM bit 
                 settings of the block containing the byte addressed by the EA. 

                 This instruction is treated as a store to the addressed byte with 
                 respect to address translation and protection. The reference and 
                 change bits are modified appropriately. 

                 If the EA specifies a memory address for which T=1 in the 
                 corresponding segment register, the instruction is treated as a no-op. 



3.8.2 User-Level Cache Instructions 

The instructions summarized in this section provide user-level programs the ability to 
manage the MPC601's unified cache. Note that the term block in the context of the on-chip 
cache refers to a sector within the cache (and not a block defined by the block address 
translation (BAT) mechanism). 
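
To make the block/sector terminology concrete, the following Python sketch computes which cache sector and cache line an effective address falls in. It uses the sizes stated in Table 3-43 (a 32-byte sector within a 64-byte line); the function names are hypothetical and addresses are treated as 32-bit values.

```python
SECTOR_BYTES = 32  # a "block" for the cache instructions is one sector
LINE_BYTES = 64    # each cache line holds two sectors


def sector_base(ea):
    """Address of the first byte of the sector containing ea."""
    return ea & ~(SECTOR_BYTES - 1) & 0xFFFFFFFF


def line_base(ea):
    """Address of the first byte of the cache line containing ea."""
    return ea & ~(LINE_BYTES - 1) & 0xFFFFFFFF


def companion_sector(ea):
    """Base address of the other sector in the same cache line."""
    return sector_base(ea) ^ SECTOR_BYTES
```

For an address such as x'1234', the addressed sector starts at x'1220', the line starts at x'1200', and the companion sector is the one at x'1200'.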

As with other memory-related instructions, the effect of the cache instructions on memory 
is weakly consistent. If the programmer needs to ensure that cache or other instructions 
have been performed with respect to all other processors and mechanisms, a sync 
instruction must be placed in the program following those instructions. 

When data address translation is disabled (MSR[DT]=0), the Data Cache Block Set to Zero 
(dcbz) instruction allocates a line in the cache and may not verify that the physical address 
is valid. If a line is created for an invalid physical address, a machine check condition may 
result when an attempt is made to write that line back to memory. The line could be written 
back as the result of either the execution of an instruction that causes a cache miss for which 
the invalid addressed line is the target for replacement, or a Data Cache Block Store (dcbst) 
instruction. 

Any cache control instruction that generates an effective address that corresponds to an I/O 
controller interface segment (SR[T]=1) that has the SR[BUID] field equal to x'07F' 
translates the address appropriately and performs the cache operation based on that address. 
A cache control instruction that generates an effective address that corresponds to an I/O 
controller interface segment (SR[T]=1), but with the SR[BUID] not equal to x'07F' is 
treated as a no-op. 

Since the MPC601 is implemented with a unified (combined instruction and data) cache, 
the Instruction Cache Block Invalidate (icbi) instruction is treated as a no-op by the 
MPC601 processor. Table 3-43 summarizes the cache instructions that are accessible to 
user-level programs. 

Table 3-43. User-Level Cache Instructions 

Name:            Data Cache Block Touch 
Mnemonic:        dcbt 
Operand Syntax:  rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 

                 This instruction provides a method for improving performance 
                 through the use of software-initiated prefetch hints. The MPC601 
                 performs the fetch for the cases when the address hits in the UTLB or 
                 the BTLB, and when it is permitted load access from the addressed 
                 page. The operation is treated similarly to a byte load operation with 
                 respect to coherency. 

                 If the address translation does not hit in the UTLB or BTLB, or if it 
                 does not have load access permission, the instruction is treated as a 
                 no-op. 

                 If the access is directed to a cache-inhibited page, or to an I/O 
                 controller interface segment, then the bus operation occurs, but the 
                 cache is not updated. 

                 This instruction never affects the reference or change bits in the 
                 hashed page table. 

                 While the MPC601 maintains a cache line size of 64 bytes, the dcbt 
                 instruction may only result in the fetch of a 32-byte sector (the one 
                 directly addressed by the EA). The other 32-byte sector in the cache 
                 line may or may not be fetched, depending on activity in the dynamic 
                 memory queue. 

                 A successful dcbt instruction will affect the state of the TLB and 
                 cache LRU bits as defined by the LRU algorithm. 

Name:            Data Cache Block Touch for Store 
Mnemonic:        dcbtst 
Operand Syntax:  rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 

                 The dcbtst instruction operates exactly like the dcbt instruction as 
                 implemented on the MPC601. 

Name:            Cache Line Compute Size 
Mnemonic:        clcs 
Operand Syntax:  rD,rA 
Operation:       This is a POWER instruction, and is not part of the PowerPC 
                 architecture. This instruction will not be supported by other 
                 PowerPC implementations. 

                 This instruction places the cache line size specified by operand rA 
                 into register rD. The rA operand is encoded as follows: 

                 01100   Instruction cache line size (returns value of 64) 
                 01101   Data cache line size (returns value of 64) 
                 01110   Minimum line size (returns value of 64) 
                 01111   Maximum line size (returns value of 64) 

                 All other encodings of the rA operand return undefined values. 
                 This instruction is specific to the MPC601. 

Name:            Data Cache Block Set to Zero 
Mnemonic:        dcbz 
Operand Syntax:  rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 

                 If the block (the cache sector consisting of 32 bytes) containing the 
                 byte addressed by the EA is in the data cache, all bytes are cleared 
                 to 0. 

                 If the block containing the byte addressed by the EA is not in the data 
                 cache and the corresponding page is caching-allowed, the block is 
                 established in the data cache without fetching the block from main 
                 memory, and all bytes of the block are cleared to 0. 

                 If the page containing the byte addressed by the EA is caching- 
                 inhibited or write-through, then the system alignment exception 
                 handler is invoked. 

                 If the block containing the byte addressed by the EA is in coherence 
                 required mode, and the block exists in the data cache(s) of any other 
                 processor(s), it is kept coherent in those caches. 

                 The dcbz instruction is treated as a store to the addressed byte with 
                 respect to address translation and protection. 

                 If the EA corresponds to an I/O controller interface segment 
                 (SR[T]=1), the dcbz instruction is treated as a no-op. 

Name:            Data Cache Block Store 
Mnemonic:        dcbst 
Operand Syntax:  rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 

                 If the block (the cache sector consisting of 32 bytes) containing the 
                 byte addressed by the EA is in coherence required mode, and a block 
                 containing the byte addressed by the EA is in the data cache of any 
                 processor and has been modified, the writing of it to main memory is 
                 initiated. 

                 The function of this instruction is independent of the write-through 
                 and cache-inhibited/allowed modes of the block containing the byte 
                 addressed by the EA. 

                 This instruction is treated as a load from the addressed byte with 
                 respect to address translation and protection. 

                 If the EA corresponds to an I/O controller interface segment 
                 (SR[T]=1), the dcbst instruction is treated as a no-op. 

Name:            Data Cache Block Flush 
Mnemonic:        dcbf 
Operand Syntax:  rA,rB 
Operation:       The EA is the sum (rA|0)+(rB). 

                 The action taken depends on the memory mode associated with the 
                 target, and on the state of the block. The following list describes the 
                 action taken for the various cases, regardless of whether the page or 
                 block containing the addressed byte is designated as write-through or 
                 if it is in the caching-inhibited or caching-allowed mode. 

                 Coherency Required (WIM = xx1) 

                 - Unmodified Block: Invalidates copies of the block in the 
                   caches of all processors. 

                 - Modified Block: Copies the block to memory. 
                   Invalidates copies of the block in the caches of all 
                   processors. 

                 - Absent Block: If modified copies of the block are in the 
                   caches of other processors, causes them to be copied to 
                   memory and invalidated. If unmodified copies are in the 
                   caches of other processors, causes those copies to be 
                   invalidated. 

                 Coherency Not Required (WIM = xx0) 

                 - Unmodified Block: Invalidates the block in the 
                   processor's cache. 

                 - Modified Block: Copies the block to memory. 
                   Invalidates the block in the processor's cache. 

                 - Absent Block: Does nothing. 
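
The dcbf case analysis in Table 3-43 can be restated as a lookup, sketched here in Python. This is only a restatement of the list for illustration: the function name, the state names, and the action strings are hypothetical, and the sketch does not model a cache or bus.

```python
def dcbf_actions(coherency_required, state):
    """List the dcbf actions for a block in the given local cache state.

    coherency_required: True when WIM = xx1, False when WIM = xx0.
    state: "unmodified", "modified", or "absent" (state in the local cache).
    """
    if state not in ("unmodified", "modified", "absent"):
        raise ValueError(f"unknown block state: {state}")

    scope = "all caches" if coherency_required else "local cache"
    if state == "unmodified":
        return [f"invalidate block in {scope}"]
    if state == "modified":
        return ["copy block to memory", f"invalidate block in {scope}"]
    # Absent from the local cache.
    if coherency_required:
        return ["other caches copy modified copies to memory",
                "other caches invalidate their copies"]
    return []
```

The only case that produces no action at all is an absent block in a page that does not require coherency.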



3.8.3 Segment Register Manipulation Instructions 

The instructions listed in Table 3-44 provide access to the segment registers of the 
MPC601. These instructions operate completely independently of the MSR[IT] and 
MSR[DT] bit settings. Note that the rA operand is not defined for the mtsrin and mfsrin 
instructions in the MPC601. Refer to Section 2.3.3.1, "Synchronization for Supervisor-Level 
SPRs, and Segment Registers," for serialization requirements and other 
recommended precautions to observe when manipulating the segment registers. 

Table 3-44. Segment Register Manipulation Instructions 

Name:            Move to Segment Register 
Mnemonic:        mtsr 
Operand Syntax:  SR,rS 
Operation:       The contents of rS are placed into the segment register specified by 
                 operand SR. 

                 This is a supervisor-level instruction. 

Name:            Move to Segment Register Indirect 
Mnemonic:        mtsrin 
Operand Syntax:  rS,rB 
Operation:       The contents of rS are copied to the segment register selected by bits 
                 0-3 of rB. 

                 This is a supervisor-level instruction. 

Name:            Move from Segment Register 
Mnemonic:        mfsr 
Operand Syntax:  rD,SR 
Operation:       The contents of the segment register specified by operand SR are 
                 placed into rD. 

                 This is a supervisor-level instruction. 

Name:            Move from Segment Register Indirect 
Mnemonic:        mfsrin 
Operand Syntax:  rD,rB 
Operation:       The contents of the segment register selected by bits 0-3 of rB are 
                 copied into rD. 

                 This is a supervisor-level instruction. 
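
The indirect forms select a segment register from bits 0-3 of rB. Because bit 0 is the most-significant bit in this manual's numbering, those are the upper four bits of the 32-bit register value. The following Python sketch shows the selection; the function name is hypothetical.

```python
def segment_register_index(rb):
    """Return the segment register number (0-15) selected by bits 0-3 of rB."""
    return (rb >> 28) & 0xF
```

For example, any value of rB in the range x'C0000000' through x'CFFFFFFF' selects segment register 12.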



3.8.4 Translation Look-Aside Buffer Management Instructions 

The MPC601 implements a TLB that caches portions of the page table. As changes are 
made to the address translation tables, the TLB must be updated. This is done by explicitly 
invalidating TLB entries (both in the set) with the Translation Lookaside Buffer Invalidate 
Entry (tlbie) instruction. 

Because the presence, absence, and exact semantics of various translation lookaside buffer 
management instructions are implementation dependent, system software should 
encapsulate uses of such instructions into subroutines to minimize the impact of migrating 
from one implementation to another. 

3.9 External Control Instructions 

The external control instructions provide a means for a user-level program to communicate 
with a special-purpose device. Two instructions are provided and are summarized in 
Table 3-46. 

Table 3-45. Translation Lookaside Buffer Management Instruction 

Name:            Translation Lookaside Buffer Invalidate Entry 
Mnemonic:        tlbie 
Operand Syntax:  rB 
Operation:       The effective address is the contents of rB. If the TLB contains an 
                 entry corresponding to the EA, that entry is removed from the TLB. 
                 The TLB search is done regardless of the settings of MSR[IT] and 
                 MSR[DT]. Also, a TLB invalidate operation is broadcast on the 
                 system bus. 

                 Block address translation for the EA, if any, is ignored. 

                 If the corresponding segment register for the EA specifies T=1 (an I/O 
                 controller interface segment), no TLB entry invalidation is performed 
                 on the local processor and no TLB invalidate is broadcast. 

                 Because the MPC601 supports broadcast of TLB entry invalidate 
                 operations, the following must be observed: 

                 • The tlbie instruction must be contained in a critical section of 
                   memory controlled by software locking, so that the tlbie is issued 
                   on only one processor at a time. 

                 • A sync instruction must be issued after every tlbie and at the end 
                   of the critical section. This causes hardware to wait for the effects 
                   of the preceding tlbie instruction(s) to propagate to all 
                   processors. 

                 A processor detecting a TLB invalidate broadcast does the following: 

                 1. Prevents execution of any new load, store, cache control or 
                    tlbie instructions and prevents any new reference or change 
                    bit updates 

                 2. Waits for completion of any outstanding memory operations 
                    (including updates to the reference and change bits 
                    associated with the entry to be invalidated) 

                 3. Invalidates the two entries (both associativity classes) in the 
                    UTLB indexed by the matching address 

                 4. Resumes normal execution 

                 This is a supervisor-level instruction. 

                 Software must ensure that SDR1 points to the page table when 
                 issuing tlbie, even when address translation is disabled. Nothing is 
                 guaranteed about instruction fetching in other processors if tlbie 
                 deletes the page in which another processor is executing. 



3.10 Miscellaneous Simplified Mnemonics 

In order to make assembly language programs simpler to write and easier to understand, a 
set of simplified mnemonics are provided that define a shorthand for some of the most 
frequently used instructions. PowerPC compliant assemblers provide the simplified 
mnemonics listed here, and in the sections describing the branch, arithmetic, compare, trap, 
rotate and shift, and move to/from special purpose register instructions. Programs written 
to be portable across the various assemblers for the PowerPC architecture should not 
assume the existence of mnemonics not defined in this user's manual. 

Table 3-46. External Control Instructions 

Name: External Control in Word Indexed 
Mnemonic: eciwx 
Operand Syntax: rD,rA,rB 
Operation: 

The EA is the sum (rA|0)+(rB). 

If the external access register (EAR) E-bit (bit 0) is set to 1, a load 
request for the physical address corresponding to the EA is sent to 
the device identified by the EAR Resource ID bits (bits 26-31), 
bypassing the cache. The word returned by the device is placed in 
rD. The EA sent to the device must be word aligned. 

If EAR[E]=0, a data access exception is invoked, with bit 11 of 
DSISR set to 1. 

The eciwx instruction is supported for EAs that reference ordinary 
memory segments (SR[T]=0), for EAs mapped by BAT registers, and 
for EAs generated when MSR[DT]=0. The instruction is treated as a 
no-op for EAs in I/O controller interface segments (SR[T]=1). 

The access caused by this instruction is treated as a load from the 
location addressed by the EA with respect to protection and 
reference and change recording. 

Name: External Control out Word Indexed 
Mnemonic: ecowx 
Operand Syntax: rS,rA,rB 
Operation: 

The EA is the sum (rA|0)+(rB). 

If the external access register (EAR) E-bit (bit 0) is set to 1, a store 
request for the physical address corresponding to the EA and the 
contents of rS are sent to the device identified by EAR[RID] (resource 
ID) (bits 26-31), bypassing the cache. The EA sent to the device must 
be word aligned. 

If EAR[E]=0, a data access exception is invoked, with bit 11 of 
DSISR set to 1. 

The ecowx instruction is supported for EAs that reference ordinary 
memory segments (SR[T]=0), for EAs mapped by BAT registers, and 
for EAs generated when MSR[DT]=0. The instruction is treated as a 
no-op for EAs in I/O controller interface segments (SR[T]=1). 

The access caused by this instruction is treated as a store to the 
location addressed by the EA with respect to protection and 
reference and change recording. 
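
The address and enable checks described for eciwx and ecowx can be modeled in a few lines of Python. This is a sketch only: the function names, the register-file list, and the choice to raise an error for a misaligned EA are assumptions of the sketch and are not defined by this manual. Bit 0 is taken to be the most-significant bit of a 32-bit register.

```python
MASK32 = 0xFFFFFFFF
DSISR_BIT11 = 1 << (31 - 11)  # bit 0 is the most-significant bit


def external_control_ea(gpr, ra, rb):
    """EA = (rA|0) + (rB): rA = 0 selects the literal value 0, not GPR 0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & MASK32


def decode_ear(ear):
    """Return (E, RID) from a 32-bit EAR image: E is bit 0, RID is bits 26-31."""
    return (ear >> 31) & 1, ear & 0x3F


def external_access(gpr, ear, ra, rb):
    """Model the decision made before an eciwx/ecowx request is sent."""
    enabled, rid = decode_ear(ear)
    if not enabled:
        # EAR[E] = 0: data access exception with DSISR bit 11 set.
        return ("data_access_exception", DSISR_BIT11)
    ea = external_control_ea(gpr, ra, rb)
    if ea & 0x3:
        # Assumption of this sketch: reject EAs that are not word aligned.
        raise ValueError("EA must be word aligned")
    return ("access", rid, ea)
```

For example, with EAR = 0x80000005 the request is directed to resource ID 5; with EAR = 0x00000005 the sketch reports the exception case instead.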



3.10.1 No-op 

Many PowerPC instructions can be coded in a way such that, effectively, no operation is 
performed. An additional mnemonic is provided for the preferred form of no-op. If an 
implementation performs any type of run-time optimization related to no-ops, the preferred 
form is the no-op that will trigger this. 



nop (equivalent to ori 0,0,0) 



3.10.2 Load Immediate 

The addi and addis instructions can be used to load an immediate value into a register. 
Additional mnemonics are provided to convey the idea that no addition is being performed 
but that data is being moved from the immediate operand of the instruction to a register. 

Load a 16-bit signed immediate value into rA: 

li rA,value (equivalent to addi rA,0,value) 

Load a 16-bit signed immediate value, shifted left by 16 bits, into rA: 

lis rA,value (equivalent to addis rA,0,value) 
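
As an illustration only, the following Python sketch models the 32-bit value that each mnemonic leaves in rA. The function names and the use of Python integers to stand for registers are assumptions of the sketch, not assembler syntax.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Sign-extend a 16-bit immediate to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def li(value):
    """Model li rA,value (addi rA,0,value): rA receives the sign-extended immediate."""
    return sign_extend16(value) & MASK32


def lis(value):
    """Model lis rA,value (addis rA,0,value): the immediate is shifted left 16 bits."""
    return (sign_extend16(value) << 16) & MASK32
```

A full 32-bit constant is commonly built by combining the two forms, with lis supplying the upper half-word and a following instruction supplying the lower half-word.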

3.10.3 Load Address 

This mnemonic permits computing the value of a base-displacement operand, using the 
addi instruction which normally requires a separate register and immediate operands. 

la rD,SIMM(rA) (equivalent to addi rD,rA,SIMM) 

The la mnemonic is useful for obtaining the address of a variable specified by name, 
allowing the assembler to supply the base register number and compute the displacement. 
If the variable v is located at offset SIMMv bytes from the address in register rv, and the 
assembler has been told to use register rv as a base for references to the data structure 
containing v, then the following line causes the address of v to be loaded into register rD. 

la rD,v (equivalent to addi rD,rv,SIMMv) 

3.10.4 Move Register 

Several PowerPC instructions can be coded to simply copy the contents of one register to 
another. An extended mnemonic is provided to move data from one register to another with 
no computational activity. 

The following instruction copies the contents of register rS into register rA. This 
mnemonic can be coded with a "." to cause the condition register update option to be 
specified in the underlying instruction. 

mr rA,rS (equivalent to or rA,rS,rS) 

3.10.5 Complement Register 

Several PowerPC instructions can be coded to complement the contents of one register and 
place the result in another register. A simplified mnemonic is provided that complements 
the contents of rS and places the results into register rA. This mnemonic can be coded with 
a "." to cause the condition register update option to be specified in the underlying 
instruction. 

not rA,rS (equivalent to nor rA,rS,rS) 

Chapter 4 

Cache and Memory Unit Operation 

The MPC601 contains a 32-Kbyte, eight-way set associative, unified (instruction and data) 
cache. The cache line size is 64 bytes, divided into two eight-word sectors, each of which 
can be snooped, loaded, cast-out, or invalidated independently. The cache is designed to 
adhere to a write-back policy, but the MPC601 allows control of cacheability, write policy, 
and memory coherency at the page and block level. The cache uses a least recently used 
(LRU) replacement policy. 

The MPC601's on-chip cache is non-blocking. Burst operations to the cache are buffered 
such that the cache update is reduced to two single-cycle operations of four words. That is, 
the results of the first two and the last two bursts are buffered and written to the cache in 
single cycles apiece. This frees the cache to perform lower priority operations in the 
meantime. 

System operations, including cache operations, connect to the system interface through the 
memory unit, which includes a two-element read queue and a three-element write queue. 

The cache provides an eight-word interface to the rest of the device. The surrounding logic 
selects, organizes, and forwards the requested information to the requesting unit. Write 
operations to the cache can be performed on a byte basis, and a complete read-modify-write 
operation to the cache can occur in each cycle. 

The cache unit and the memory unit coordinate cache reload and cast-out operations so that 
a cache miss does not block the use of the cache for other operations during the next cycle. 
Cache reload operations always occur on a sector basis, with the option of reloading the 
additional sector as a low-priority operation. On loads and fetch operations, the critical data 
is forwarded to the requesting unit without waiting for the entire cache line to be loaded. 

The MPC601 maintains cache coherency in hardware by coordinating activity between the 
cache, the memory unit, and the bus interface logic. As bus operations are performed on the 
bus by other devices, the MPC601 bus snooping logic monitors the addresses that are 
referenced. These addresses are compared with the addresses resident in the cache. The 
cache unit uses a second port into its tag directory to check for a matching entry and the 
memory queue unit does the same. If there is a snoop hit, the MPC601's bus snooping logic 
responds to the bus interface with the appropriate snoop status. An additional snoop action 
may be forwarded to the cache or to the memory unit as a result of a snoop hit. 

Note that in this chapter the term multiprocessor is used in the context of maintaining cache 
coherency, although the system could include other devices that can access system memory, 
maintain their own caches, and function as bus masters requiring cache coherency. 

This chapter describes the organization of the MPC601's on-chip cache, the MESI cache 
coherency protocol, special concerns for cache coherency in single- and multiple-processor 
systems, cache control instructions, various cache operations, and the interaction between 
the cache and the memory unit. 

4.1 Cache Organization 

The cache is configured as eight sets of 64 lines. Each line consists of two sectors, four state 
bits (two per sector), an address tag, and several bits to maintain the LRU function. The two 
state bits implement the four-state MESI (modified-exclusive-shared-invalid) protocol. 
Each sector contains eight 32-bit words. Note that PowerPC architecture defines the 
cacheable unit as a block, which is a sector in the MPC601. 

The instruction unit accesses the cache frequently in order to maintain the flow of 
instructions through the instruction queue. The queue is eight words (one sector) long, so 
an entire sector can be loaded into the instruction unit on a single clock cycle. 

The cache organization is shown in Figure 4-1. Note that the replacement algorithm is 
strictly an LRU algorithm; that is, the least recently used sector is used, which may mean 
that a modified sector will be replaced on a miss if it is the least recently used, even if 
invalid sectors are available. However, for performance reasons, certain conditions (for 
example, the execution of some cache instructions) generate accesses to the cache without 
modifying the bits that perform the LRU function. 
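
The strict LRU behavior described above can be sketched as follows. The class name, the method names, and the list-based recency order are assumptions made for illustration; the manual does not describe how the LRU bits are encoded. The term "way" is used here for one of the eight candidate locations for a line.

```python
class StrictLruSet:
    """Eight candidate locations for one line index, replaced strictly by recency."""

    def __init__(self, ways=8):
        # order[0] is the least recently used way, order[-1] the most recent.
        self.order = list(range(ways))

    def touch(self, way, update_lru=True):
        """Record an access; some cache instructions leave the LRU bits unchanged."""
        if update_lru:
            self.order.remove(way)
            self.order.append(way)

    def victim(self):
        """Return the way replaced on a miss.

        Validity is deliberately not consulted: the least recently used way is
        chosen even if another way holds an invalid sector.
        """
        return self.order[0]
```

Because victim() ignores the state bits, a modified sector can be selected while invalid sectors remain, which is the case the text calls out.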

Each cache line contains 16 contiguous words from memory that are loaded from a 
16-word boundary (that is, bits A26-A31 of the logical addresses are zero); as a result, 
cache lines are aligned with page boundaries. 

Note that address bits A20-A25 provide an index to select a line. Bits A26-A31 select a 
byte within a line. The tags consist of bits PA0-PA19. Address translation occurs in 
parallel, such that higher-order bits (the tag bits in the cache) are physical. 
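
The field boundaries above can be checked with a short Python sketch. The function name and the returned tuple layout are assumptions of the sketch. A single 32-bit value is passed in for simplicity; in the processor the index and byte offset come from the untranslated low-order bits while the tag comes from the translated physical address.

```python
LINE_BYTES = 64      # two 32-byte sectors per line
LINES_PER_SET = 64
SETS = 8
assert SETS * LINES_PER_SET * LINE_BYTES == 32 * 1024  # 32-Kbyte cache


def split_address(addr):
    """Split a 32-bit address (bit 0 = most significant) into cache fields."""
    tag = (addr >> 12) & 0xFFFFF  # PA0-PA19
    index = (addr >> 6) & 0x3F    # A20-A25 select one of 64 lines
    offset = addr & 0x3F          # A26-A31 select a byte within the line
    sector = offset >> 5          # A26 selects one of the two sectors
    return tag, index, offset, sector
```

For the address 0x12345678 the sketch yields tag 0x12345, line index 25, byte offset 56, and sector 1.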

Figure 4-1. Cache Organization 

4.2 Cache Arbitration 

The instruction unit and the integer unit both access the cache; however, the cache unit 
handles only one access per cycle. Furthermore, since the cache is nonblocking, a 
preceding cache operation may generate a cache reload operation which must also compete 
for cache access. The bus snooping logic may create additional snoop actions that use the 
cache. The MPC601 efficiently handles simultaneous requests to access the on-chip cache. 

The MPC601 implements cache arbitration logic to prioritize the various cache requests 
that can occur on each cycle. The cache unit provides a cache retry queue (CRTRY) if a 
caching operation cannot be completed. There are three entries in this queue, providing a 
buffer for one outstanding floating-point store, one integer store, and one instruction fetch. 
Priority is given first to floating-point stores, then to integer stores, and finally to instruction 
fetches. 

A similar situation arises with respect to the bus. Internal bus arbitration logic chooses the 
highest priority operation from the memory queue for presentation onto the bus. These 
priorities are listed in Section 4.10.2, "Memory Unit Queuing Priorities." 

The MPC601 supports a fully-coherent 4-Gbyte physical memory address space. Bus 
snooping is used to drive a MESI four-state cache-coherency protocol that ensures the 
coherency of all processor and DMA transactions to and from global memory with respect 
to the processor's cache. The MESI protocol is described in Section 4.7.2, "MESI 
Protocol." All potential bus masters must employ similar snooping and coherency-control 
mechanisms. 

4.3 Cache Access Priorities 

The MPC601 prioritizes pending cache operations as follows: 

1. Cache reloads. Note that the cache is non-blocking. Four-beat burst reloads on the 
system bus are buffered into two single-cycle transactions of four words each, 
freeing the cache to perform lower priority operations in the meantime. 

2. Second-cycle cast-out operations when the additional sector is modified 

3. Snoop requests that hit in the tag directory. These generate a cache sector push 
operation. 

4. Floating-point store operations 

5. Integer operation retries. If a higher priority operation occurs when an integer 
operation is ready to cache its results, the results are held in a buffer until the higher 
priority operation completes, then it is retried on the next clock cycle. This prevents 
the integer unit from stalling when this situation occurs. 

6. Integer unit requests 

7. Instruction fetches 
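
The ordering above amounts to a fixed-priority arbiter that grants one request per cycle. A minimal Python sketch follows; the request names are hypothetical labels chosen for this example and do not correspond to signal names in the MPC601.

```python
# Highest priority first, following the numbered list above.
CACHE_PRIORITY = [
    "cache_reload",
    "second_cycle_cast_out",
    "snoop_hit",
    "fp_store",
    "integer_retry",
    "integer_request",
    "instruction_fetch",
]


def arbitrate(pending):
    """Return the single request granted the cache this cycle, or None."""
    for kind in CACHE_PRIORITY:
        if kind in pending:
            return kind
    return None
```

A request that loses arbitration remains pending and competes again on the next cycle.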

4.4 Basic Cache Operations 

This section describes operations that can occur to the cache, and how these operations are 
implemented in the MPC601. 

4.4.1 Cache Reloads 

A cache sector is reloaded after a read miss occurs in the cache. The cache sector that 
contains the address is updated by a burst transfer of the data from system memory. Note 
that if a read miss occurs in a multiprocessor system, and the data is modified in another 
cache, the modified data is first written to external memory before the cache reload occurs. 

An instruction prefetch that is generated to fill the instruction queue (not explicitly required 
by the program flow) does not generate a reload operation in the case of a cache miss. 

4.4.2 Cache Cast-Out Operation 

The MPC601 uses an LRU replacement algorithm to determine which of the eight possible 
cache locations should be used for a cache update. Adding a new sector to the cache causes 
any modified data associated with the least recently used element to be written back, or cast 
out, to system memory. This includes both sectors of the line, even though only one sector 
may be reloaded. Casting out of the adjacent sector is referred to as a second-cycle cast-out 
operation. 

4.4.3 Cache Sector Push Operation 

When a cache sector in the MPC601 is snooped and hit by another processor and the data 
is modified, the cache sector must be written to memory and made available to the snooping 
device. The cache sector that is hit is said to be pushed out onto the bus. The MPC601 
supports two kinds of push operations: normal push operations and enveloped high-priority 
push operations, which are described in Section 4.7.11, "Enveloped High-Priority 
Cache Sector Push Operation." 

4.4.4 Optional Cache Sector Line-Fill Operation 

The two sectors in a cache line contain contiguous memory addresses; therefore, the two 
sectors share the same line address tag. Cache coherency, however, is maintained on a 
sector granularity, so there are separate coherency state bits for each sector. If one sector of 
the line is filled from memory, the MPC601 may attempt to load the other sector as a low- 
priority bus operation. 

If the sector is not transferred, the cache line in the snooping processor contains one sector 
that is in the shared state (the one that was transferred because of the snoop hit) and one 
sector that is invalid (if the optional cache line fill is not performed). 

Note that the optional reload of an adjacent sector on an instruction fetch miss can be 
disabled globally by setting bit 26 in the HID0 register, and the optional reload of the 
adjacent sector on a load/store miss can be disabled by setting bit 27. 

4.5 Cache Data Transactions 

The MPC601 output signal TBST (transfer burst) indicates to the system whether the 
current transaction is a single-beat transaction or four-beat burst transfer. Burst transactions 
have an assumed address order. For cacheable load operations or cacheable, non-write-through 
store operations that miss the cache, the MPC601 presents the quad-word aligned 
address associated with the read or store that initiated the transaction. (Note that for 
optimizing programs to be used with subsequent PowerPC processors, programs should be 
double-word aligned.) 

As shown in Figure 4-2, this quad-word contains the address of the load or store that missed 
the cache. This minimizes latency by allowing the critical code or data to be forwarded to 
the processor before the rest of the sector is filled. For all other burst operations, however, 
the entire sector is transferred in order (oct-word aligned). 

[Figure: MPC601 cache address bits (27..28) identify double words A, B, C, and D within a sector.] 

If the address requested is in double word A or B, then the address placed on the bus is that of 
quad-word A, and the four data beats are ordered in the following manner: 

Beat order: A, B, C, D 

If the address requested is in double word C or D, then the address placed on the bus is that 
of quad-word C, and the four data beats are ordered in the following manner: 

Beat order: C, D, A, B 

Figure 4-2. Quad-Word Address Ordering 
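
The ordering rule can be expressed compactly in Python. This sketch assumes that double words A, B, C, and D occupy ascending addresses within the sector, so that address bits 27..28 number them 0 through 3; the function name is hypothetical.

```python
def burst_order(addr):
    """Return (bus_address, beat_order) for a burst that fetches the critical quad-word first."""
    bus_address = addr & 0xFFFFFFF0  # quad-word (16-byte) aligned address
    dword = (addr >> 3) & 0x3        # address bits 27..28 (bit 31 is least significant)
    if dword < 2:                    # requested double word is A or B
        beats = ["A", "B", "C", "D"]
    else:                            # requested double word is C or D
        beats = ["C", "D", "A", "B"]
    return bus_address, beats
```

A miss on address 0x1018 falls in double word D under this assumption, so the sketch reports bus address 0x1010 and the beat order C, D, A, B.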

4.6 Access to I/O Controller Interface Segments 

The MPC601 supports two kinds of operations that involve the I/O controller interface: 

• I/O controller interface operations. These operations are considered to address the 
noncoherent and noncacheable I/O controller interface; therefore, the MPC601 does 
not maintain coherency for these operations, and the cache is bypassed completely. 

• Memory-forced I/O controller interface operations. These operations are considered 
to address memory space and are therefore subject to the same coherency control as 
memory accesses. These operations are global memory references within the 
MPC601 and are considered to be noncacheable and write-through. 

Cache behavior (write-back, cache-inhibition, and enforcement of MESI coherency) for 
these operations is determined by the settings of the WIM bits. See Section 6.3, "Memory/Cache 
Access Modes." 

4.7 Cache Coherency 

The primary objective of a coherent memory system is to provide the same image of 
memory to all devices using the system. Coherency allows synchronization, cooperative 
use of shared resources, and task migration among the processors. Otherwise, for example, 
a device performing a store operation would require exclusive access to the addressed 
sector before making an update to prevent another device from using stale data. 

Each potential bus master must follow rules for managing the state of its cache. For 
example, a device must broadcast its intention to read a sector that is not currently in the 
cache. It must also broadcast the intention to write into a sector that is currently not owned 
exclusively. Other devices respond to these broadcasts by snooping their caches for the 
broadcast addresses and reporting status back to the originating device. The status returned 
includes a shared indicator (another device has a copy of the addressed sector) and a retry 
indicator (another device either has a modified copy of the addressed sector that it needs to 
push out of the chip, or another device had a problem that prevented appropriate snooping). 

For faster performance, the MPC601 has a second path into the cache directory so snooping 
and mainstream instruction processing occur concurrently. Instruction processing is 
interrupted only when the snoop control logic detects a state change or that a snoop push 
of modified data is required to maintain memory coherency. 

To maintain coherency, secondary caches must forward all relevant system bus traffic onto 
the MPC601 bus, which takes the appropriate actions to maintain the MESI protocol. 

Support for lwarx and stwcx. instructions on noncacheable pages may be somewhat more 
complicated for a secondary cache than normal cacheable memory accesses. This is 
because the secondary cache may not normally forward writes to noncacheable pages in the 
processor. However, to maintain the reservation coherency bit, the secondary cache must 
know to forward all writes that hit against a specified address. 

4.7.1 Memory Management Access Mode Bits — W, I, and M 

Some memory characteristics can be set on either a block or page basis by using the WIM 
bits in the BAT registers or page table entry (PTE) respectively. The WIM bits control the 
following functionality: 

• Write-through (W bit) 

• Caching-inhibited (I bit) 

• Memory coherency (M bit) 

These bits allow both single- and multiprocessor-system designs to exploit numerous 
system-level performance optimizations. These bits are described in detail in Chapter 2, 
"Registers and Data Types," and Chapter 6, "Memory Management Unit." Using these bits 
carelessly can cause coherency problems — such as when flushing pages that correspond to 
the changed WIM bits from the caches of all devices in the system or when the address 
translations of aliased physical addresses specify different values for any of the WIM bits. 
The MPC601 considers either of these cases to be a programming error that may 
compromise memory coherency. These paradoxes can occur within a single processor or 
across several devices, as described in Section 4.7.5.1, "Coherency in Single-Processor 
Systems," and Section 4.7.5.2, "Coherency in Multiprocessor Systems." 
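
As a minimal sketch, the three WIM bits can be decoded and two aliased translations compared as shown below. The function names and dictionary keys are assumptions of the sketch; W is taken as the most-significant of the three bits, consistent with the WIM = 001 setting that denotes write-back, caching-not-inhibited, coherency-enforced accesses.

```python
def decode_wim(wim):
    """Decode a 3-bit WIM value (W is the most-significant bit)."""
    return {
        "write_through": bool(wim & 0b100),
        "caching_inhibited": bool(wim & 0b010),
        "memory_coherency": bool(wim & 0b001),
    }


def wim_conflict(wim_a, wim_b):
    """True when two translations of one physical address disagree on any WIM bit."""
    return ((wim_a ^ wim_b) & 0b111) != 0
```

System software could apply a check such as wim_conflict() when it creates an alias for a physical page, since differing WIM values for the same physical address are treated as a programming error.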

4.7.2 MESI Protocol 

The MPC601 cache characterizes each 32-byte sector it contains as being in one of four 
MESI states. Addresses presented to the cache are indexed into the cache directory with bits 
A20-A25 and the upper-order 20 bits from the physical address translation (PA0-PA19) are 
compared against the indexed cache directory tags. If no tags match, the result is a cache 
miss. If a tag matches, a cache hit occurred and the directory indicates the state of the sector 
through two state bits kept with the tag. The four possible states for a sector in the cache 
are the invalid state (I), the shared state (S), the exclusive state (E), and the modified state 
(M). The four MESI states are defined in Table 4-1 and illustrated in Figure 4-3. 

Table 4-1. MESI State Definitions 

MESI State      Definition 

Modified (M)    The addressed sector is valid in the cache and in only this cache. The sector is 
                modified with respect to system memory; that is, the modified data in the sector 
                has not been written back to memory. 

Exclusive (E)   The addressed sector is in this cache only. The data in this sector is consistent 
                with system memory. 

Shared (S)      The addressed sector is valid in the cache and in at least one other cache. This 
                sector is always consistent with system memory. That is, the shared state is 
                shared-exclusive; there is no shared-modified state. 

Invalid (I)     This state indicates that the addressed sector is not resident in the cache. 

Figure 4-3. MESI States 

4.7.3 MESI State Diagram 

The MPC601 provides dedicated hardware to provide memory coherency by snooping bus 
transactions. The address retry capability of the MPC601 enforces the MESI protocol, as 
shown in Figure 4-4. Figure 4-4 assumes that the WIM bits are set to 001; that is, write-back, 
caching-not-inhibited, and memory coherency enforced. 

Table 4-7 gives a detailed list of MESI transitions for various operations and WIM bit 
settings. 

[State diagram not reproduced. The legend follows.] 

Bus transactions: 

RH = Read Hit 
RMS = Read Miss, Shared 
RME = Read Miss, Exclusive 
WH = Write Hit 
WM = Write Miss 
SHR = Snoop Hit on a Read 
SHW = Snoop Hit on a Write or Read-with-Intent-to-Modify 

Transition annotations: Snoop Push, Invalidate Transaction, Read-with-Intent-to-Modify, 
Cache Sector Fill 

Figure 4-4. MESI Cache Coherency Protocol: State Diagram (WIM = 001) 

4.7.4 MESI Hardware Considerations 

In addition to the hardware required to monitor bus traffic for coherency, the MPC601 has 
a cache port dedicated to snooping so that comparing cache entries to address traffic on the 
bus does not affect the MPC601 's on-chip cache. 

The global (GBL) signal, asserted as part of the address attribute field, enables the snooping 
hardware of the MPC601. Address bus masters assert GBL to indicate that the current 
transaction is a global access (that is, an access to memory shared by more than one device). 
If GBL is not asserted for the transaction, that transaction is not snooped. 

Normally, GBL reflects the M-bit value specified for the memory reference in the 
corresponding translation descriptor(s). Care must be taken to minimize the number of 
pages marked as global, because the retry protocol enforces coherency and can use 
considerable bus bandwidth if much data is shared. Therefore available bus bandwidth can 
decrease as more traffic is marked global. Note that in Figure 4-4, write hits to unmodified 
lines of nonglobal pages do not generate invalidate broadcasts. 

The MPC601 snoops a transaction if the transfer start (TS) and GBL inputs are asserted 
together in the same bus clock (this is a qualified snooping condition). No snoop update to 
the MPC601 cache occurs if the snooped transaction is not marked global. This includes 
invalidation cycles. 

When the MPC601 detects a qualified snoop condition, the address associated with the TS 
is compared with the cache tags through a dedicated cache-tag snoop port. Snooping 
finishes if no hit is detected. If, however, the address hits in the cache, the MPC601 reacts 
according to the MESI protocol shown in Figure 4-4. 

Because they do not require snooping, cache sector cast-outs, snoop pushes, and table-search 
operations do not assert GBL. The MPC601 marks these transactions as nonglobal. 

To facilitate external monitoring of the internal cache tags, the cache set member signals 
(CSE0-CSE2) represent in binary the sector of the cache set being replaced on read 
operations (including read-with-intent-to-modify operations). This does not apply and is 
not necessary for write operations to memory. Note that these signals are valid only for 
MPC601 burst operations. Table 4-2 shows the CSE encodings. 



Table 4-2. CSE0-CSE2 Signals 

CSE0-CSE2    Cache Set Element 
000          Set 0 
001          Set 1 
010          Set 2 
011          Set 3 
100          Set 4 
101          Set 5 
110          Set 6 
111          Set 7 



4.7.5 Coherency Precautions 

Cache coherency is greatly affected by whether the MPC601 is used in a single- or 
multiple-processor implementation. This section describes precautions for implementing 
coherent single- and multiple-processor systems. 

4.7.5.1 Coherency in Single-Processor Systems 

The following situations concerning coherency can be encountered within a single 
processor implementation: 

• Load or store to a cache-inhibited page (WIM = b'X1X') and a cache hit occurs 

Caching is inhibited for this page (I = 1). Load or store operations to a cache-inhibited 
page that hit in the cache cause a paradox. If the addressed sector is not 
modified, the MPC601 invalidates the sector and performs the memory access. If the 
addressed sector in the cache line is modified, the MPC601 flushes the modified 
sector before accessing memory. 

• Store to a page marked write-through (WIM = b'10X') and a cache hit to a modified 
sector 

This page is marked as write-through (W = 1). Store operations to a write-through 
page that hit a modified sector are considered coherency paradoxes by the processor. 
The MPC601 pushes the modified sector to memory and marks the sector exclusive 
(E). Then the MPC601 writes the data into the cache, marking it exclusive and 
passing on a write-with-flush operation (to the memory queue). 

Note that when WIM bits are changed, it is critical that the cache contents should reflect the 
new WIM bit settings. For example, if a block or page that had allowed caching becomes 
caching-inhibited, the appropriate cache sectors should be updated to leave no indication 
that caching had previously been allowed. 

4.7.5.2 Coherency in Multiprocessor Systems 

Other situations concerning coherency can occur across multiple processors (or systems 
that employ multiple devices that incorporate caches). Paradoxes in multiprocessor 
systems are particularly difficult to handle since some scenarios cause modified data to be 
purged and others may lead to bus deadlock scenarios. 

Most multiprocessor paradoxes center around the interprocessor coherency of the memory 
coherency bit (the M bit). Improper use of the M bit can lead to multiple devices accepting 
a cache sector and marking the data as exclusive, leading to the possibility of the same 
cache line being modified in multiple caches. 

Although these coherency paradoxes are considered programming errors, the MPC601 
attempts to handle the offending conditions and minimize the negative effects on memory 
coherency. Note that the intent of this effort is to ease the debugging of multiprocessor 
operating system development. The following lists some of the operations provided by the 
MPC601: 

• Noncacheable write operations appear on the processor bus as write-with-flush 
operations, which forces other processors with modified copies of the addressed 
sector to write data back to memory and to mark the sector as invalid in the cache. 
Devices with an unmodified copy of the sector must mark the sector as invalid in 
their caches. 

• All noncacheable read operations appear on the MPC601 bus as read (with clean) 
operations, which forces processors with modified copies of the addressed data to 
write the data back to memory before the read operation completes. 

Note that when WIM bits are changed, it is critical that the cache contents should reflect the 
new WIM bit settings. For example, if a block or page that had allowed caching becomes 
caching-inhibited, the appropriate cache sectors should be updated to leave no indication 
that caching had previously been allowed. 

Additional information on bus operations that are generated for specific instructions and 
state conditions can be found in Chapter 9, "System Interface Operation." 

4.7.6 Memory Loads and Stores 

Table 4-3 provides a general overview of memory coherency actions performed by the 
MPC601 on load operations. 

Table 4-3. Memory Coherency Actions on Load Operations 

Cache State   Bus Operation   ARTRY        SHD          Action 
I             Read            Negated      Negated      Load data and mark E 
I             Read            Negated      Asserted     Load data and mark S 
I             Read            Asserted     Don't care   Retry read operation 
S             None            Don't care   Don't care   Read from cache 
E             None            Don't care   Don't care   Read from cache 
M             None            Don't care   Don't care   Read from cache 



Noncacheable cases are not part of this table. The first three cases also involve selecting a 
replacement class and casting-out modified data that may have resided in that replacement 
class. 
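
The load cases of Table 4-3 can be summarized as a small decision function. This is a sketch under the same restrictions as the table (cacheable accesses only); the function name, argument names, and returned tuple are assumptions of the sketch, and replacement-class selection and cast-outs are not modeled.

```python
def load_action(state, artry=False, shd=False):
    """Return (bus_operation, action, next_state) for a cacheable load."""
    if state in ("M", "E", "S"):
        # A hit in any valid state is satisfied from the cache with no bus operation.
        return (None, "read from cache", state)
    if state != "I":
        raise ValueError("unknown MESI state")
    if artry:
        # Another device asserted ARTRY: the read must be retried.
        return ("read", "retry read operation", "I")
    # SHD asserted means another cache holds the sector, so it is marked shared.
    return ("read", "load data", "S" if shd else "E")
```

For instance, load_action("I", shd=True) reports a read bus operation that leaves the sector in the shared state.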

Table 4-4 provides an overview of memory coherency actions on store operations. This 
table does not include noncacheable or write-through cases nor does it completely describe 
the exact mechanisms for the operations described. It describes generally what happens 
within the chip. The read-with-intent-to-modify (RWITM) examples involve selecting a 
replacement class and casting-out modified data that may have resided in that replacement 
class. 

Table 4-4. Memory Coherency Actions on Store Operations 

Cache State   Bus Operation   ARTRY        SHD          Action 
I             RWITM           Negated      Don't care   Load data, modify it, mark M 
I             RWITM           Asserted     Don't care   Retry the RWITM 
S             Kill            Negated      Don't care   Modify cache, mark M 
S             Kill            Asserted     Don't care   Retry the kill operation 
E             None            Don't care   Don't care   Modify cache, mark M 
M             None            Don't care   Don't care   Modify cache 
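
The store cases of Table 4-4 can be sketched in the same way. As with the table, noncacheable and write-through cases are excluded; the function name, argument names, and returned tuple are assumptions made for illustration.

```python
def store_action(state, artry=False):
    """Return (bus_operation, action, next_state) for a cacheable write-back store."""
    if state in ("M", "E"):
        # The sector is already held exclusively, so no bus operation is needed.
        return (None, "modify cache", "M")
    if state == "S":
        # Other caches may hold copies: broadcast a kill before modifying.
        if artry:
            return ("kill", "retry the kill operation", "S")
        return ("kill", "modify cache", "M")
    if state == "I":
        # Miss: fetch the sector with a read-with-intent-to-modify (RWITM).
        if artry:
            return ("rwitm", "retry the RWITM", "I")
        return ("rwitm", "load data, modify it", "M")
    raise ValueError("unknown MESI state")
```

Every completed store leaves the sector in the modified state; only the bus operation needed to gain exclusive ownership differs.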



4.7.7 Atomic Memory References 

The lwarx/stwcx. instruction combination can be used to perform atomic memory 
references. These instructions are described in Chapter 3, "Addressing Modes and 
Instruction Set Summary," and Chapter 10, "Instruction Set." 

4.7.8 Snoop Response to Bus Operations 

When the MPC601 is not the bus master, it monitors bus traffic and performs cache and 
memory-queue snooping as appropriate. The snooping operation is triggered by the receipt 
of a qualified snoop request. A qualified snoop request is generated by the simultaneous 
assertion of the TS and GBL bus signals.

Instruction processing is interrupted only when a snoop hit occurs and the snoop state 
machine determines that an additional cache snoop is required to resolve the coherency of 
the offended sector. 

The MPC601 maintains a write queue of bus operations in progress and/or pending 
arbitration. This write queue must also be snooped in response to qualified snoop requests. 



The MPC601 drives two snoop status signals (ARTRY and SHD) in response to a qualified
snoop request that hits. These signals provide information about the state of the addressed 
sector for the current bus operation. These signals are described fully in Chapter 8, "Signal 
Descriptions." 
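
The way the two status signals depend on the state of the snooped sector can be sketched for the read and RWITM transactions listed in Table 4-5. The Python below is an illustrative model only; the function name snoop_response and the tuple layout are hypothetical, and the internal conflicts that can also cause ARTRY are ignored.

```python
# (ARTRY asserted, SHD asserted, state after the snoop completes)
_SNOOP_TABLE = {
    "read": {
        "I": (False, False, "I"),   # no action
        "S": (False, True, "S"),    # report shared
        "E": (False, True, "S"),    # report shared, E becomes S
        "M": (True, True, "S"),     # retry, push sector, mark shared
    },
    "rwitm": {
        "I": (False, False, "I"),   # no action
        "S": (False, False, "I"),   # invalidate
        "E": (False, False, "I"),   # invalidate
        "M": (True, True, "I"),     # retry, push sector, invalidate
    },
}


def snoop_response(transaction, state):
    """Return (artry, shd, next_state) for a snooped global transaction."""
    return _SNOOP_TABLE[transaction][state]
```

For a modified sector the next state shown is the one reached after the additional snoop action has pushed the sector out of the chip.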

4.7.9 Cache Reaction to Specific Bus Operations 

There are several bus transaction types defined for the MPC601 bus. The MPC601 must
snoop these transactions and perform the appropriate action to honor their intention to
maintain memory coherency; see Table 4-5.

A processor may assert ARTRY for any bus transaction if an internal conflict prevents it
from snooping properly. In general, if ARTRY is not asserted, each snooping processor
must take full ownership for the effects of the bus transaction with respect to the state of
the processor.

The transactions in Table 4-5 correspond to the transfer type signals TT0-TT4, which are 
described in Section 8.2.4.1, "Transfer Type (TT0-TT4)."

Table 4-5. Response to Bus Transactions 



Transaction 


Response 


Clean block 


The clean operation is an address-only bus transaction, initiated by executing a dcbst
instruction. This operation affects only sectors marked as modified (M). Assuming the
GBL signal is asserted, modified sectors are pushed out to memory, changing the state
to E.


Flush block 


The flush operation is an address-only bus transaction initiated by executing a dcbf
instruction. Assuming the GBL signal is asserted, the flush block operation results in the
following:

• If the addressed sector is shared or exclusive, an additional snoop action is 
generated internally that invalidates the addressed sector. 

• If the addressed sector is in the M state, ARTRY is asserted and an additional 
internally generated snoop action is initiated that pushes the modified sector out of the 
cache and invalidates the sector. 

• If HID0[31] = 0, and any bus read operation is pending during this snoop operation, 
the write-back of the modified sector is considered to be a high-priority bus operation 
that may be enveloped within the pending load operation. 

• If HID0[31] = 1 , and any bus read operation with HP_SNP_REQ asserted is pending 
during this snoop operation, the write-back of the modified sector is considered to be a 
high-priority bus operation that may be enveloped within the pending load operation. 

• If the addressed sector hits any of the three entries in the write queue, that entry is 
tagged as a high-priority push, after which it can be loaded from memory. 


Write with flush 
Write with flush atomic 


Write-with-flush and write-with-flush-atomic operations occur after the processor issues 
a store or stwcx. instruction, respectively. 

• If the addressed sector is in the shared or exclusive state, an additional snoop action 
is generated internally that forces the state of the addressed sector to invalid. 

• If the addressed sector is in the modified state, ARTRY is asserted and an
additional, internally generated snoop action is initiated that pushes the modified 
sector out of the cache and changes the state of the sector to invalid. 

• If HID0[31] = 0, and any bus read operation is pending during this snoop operation, 
the write-back of the modified sector is considered to be a high-priority bus operation 
that may be enveloped within the pending load operation. 

• If HID0[31] = 1 , and any bus read operation with HP_SNP_REQ asserted is pending 
during this snoop operation, the write-back of the modified sector is considered to be a 
high-priority bus operation that may be enveloped within the pending load operation. 

• If the addressed sector hits any of the three entries in the write queue, that entry is
tagged as a high-priority push operation.

Kill block 


The kill-block operation is an address-only bus transaction initiated when one of the 
following occurs: 

• a dcbi instruction is executed 

• a dcbz operation to a block marked S or I is executed

• a write operation to a block marked S occurs 

If a snoop hit occurs, an additional snoop is initiated internally and the sector is forced to
the I state, effectively killing any modified data that may have been in the sector. The
three-entry write queue is also snooped, and if a queue entry hits, it is purged.


Write with kill 


In a write-with-kill operation, the processor eventually snoops the cache for a copy of 
the addressed sector. If one is found, an additional snoop action is initiated internally 
and the sector is forced to the I state, killing modified data that may have been in the
sector. In addition to snooping the cache, the three-entry write queue is also snooped. A 
kill operation that hits an entry in the write queue purges that entry from the queue. 


Read 
Read atomic 


The read operation is used by most single- and burst reads on the bus. A read on the
bus with the GBL signal asserted causes the following responses:

• If the addressed sector is in the cache but is invalid, the MPC601 takes no action. 

• If the sector is in the shared state, the MPC601 asserts the shared snoop status 
indicator. 

• If the sector is in the E state, the MPC601 asserts the shared snoop status indicator 
and initiates an additional snoop action to change the state of that sector from E to S. 

• If the sector is in the cache in the M state, the MPC601 asserts both the ARTRY and
the SHD snoop status signals. It also initiates an additional snoop action to push the
modified sector out of the chip and to mark that cache sector as shared. 

Read atomic operations appear on the bus in response to lwarx instructions and
generate the same snooping responses as read operations. 


Read with intent to modify (RWITM)
RWITM atomic


An RWITM operation is issued to acquire exclusive use of a memory location for the 
purpose of modifying it. 

• If the addressed sector is in the I state, the MPC601 takes no action.

• If the addressed sector is in the cache and in the S or E state, the MPC601 initiates an
additional snoop action to change the state of the cache sector to I.

• If the addressed sector is in the cache and in the M state, the MPC601 asserts both
the ARTRY and the SHD snoop status signals. It also initiates an additional snoop
action to push the modified sector out of the chip and to change the state of that sector
in the cache from M to I.

The RWITM atomic operations appear on the bus in response to stwcx. instructions 
and are snooped like RWITM instructions. 


sync 


The sync instruction causes an address-only bus transaction. The MPC601 asserts the 
ARTRY snoop status if there are any TLB-related snoop operations pending in the chip. 
This transaction is also generated by the eieio instruction on the MPC601.


TLB invalidate 


A TLB invalidation operation is caused by executing a tlbie instruction. This instruction
transmits the MPC601's TLB index (bits 12-19 of the EA) onto the system bus. Other
processors on the bus invalidate TLB entries associated with EAs that match those bits. 


I/O reply 


The I/O reply operation is part of the I/O controller interface operation. It serves as the 
final bus operation in the series of bus operations that service an I/O controller interface 
operation.

4.7.10 Internal ARTRY Scenarios

The following scenarios, along with others, cause the MPC601 to assert the ARTRY signal. 

• Snoop hits to a sector in the M state (optional on kill requests) 

• Snoop hits when a reload dump request is active 

• Snoop hits on a valid (that is, not cancelled) operation that is queued internally. 

• Snoop hits while a cast-out request is pending during this or the next clock cycle. 

4.7.11 Enveloped High-Priority Cache Sector Push Operation

If the MPC601 has a read operation outstanding on the bus and another pipelined bus 
operation hits against a modified sector, the MPC601 provides a high-priority push
operation. This transaction is enveloped within the address and data tenures of a read 
operation. This feature prevents deadlocks in system organizations that support multiple 
memory-mapped buses. More specifically, the MPC601 internally detects the scenario 
where a load request is outstanding and the processor has pipelined a write operation on top 
of the load. Normally, when the data bus is granted to the MPC601, the resulting data bus 
tenure is used for the load operation. The enveloped high-priority cache sector push feature 
defines a bus signal, the data bus write only qualifier (DBWO), which, when asserted with 
a qualified data-bus grant, indicates that the resulting data tenure should be used for the 
store operation instead. This signal is described in Section 9.10, "Using DBWO (Data Bus
Write Only)." DBWO asserted at any other time is considered a no-op to the MPC601 with
respect to the ordering of the data bus tenures of pipelined bus operations. Note that the 
enveloped copy-back operation is an internally pipelined bus operation. 
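
The effect of DBWO on the ordering of data tenures can be summarized in a few lines. The following Python sketch is a simplified model of the rule stated above; the function name data_tenure_owner and its arguments are hypothetical, and the timing of the qualified data-bus grant is not modeled.

```python
def data_tenure_owner(read_outstanding, write_pipelined, dbwo):
    """Decide which pending operation uses the next granted data tenure.

    read_outstanding -- a load request is outstanding on the bus
    write_pipelined  -- a write has been pipelined on top of that load
    dbwo             -- DBWO is asserted with the qualified data-bus grant
    """
    if read_outstanding and write_pipelined:
        # DBWO lets the write (the enveloped push) go first.
        return "write" if dbwo else "read"
    # With only one candidate, DBWO has no effect on ordering.
    if read_outstanding:
        return "read"
    if write_pipelined:
        return "write"
    return None
```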

4.8 Cache Control Instructions 

Software must use the appropriate cache management instructions to ensure that caches are 
kept consistent when data is modified by the processor or by input data transfer. When a 
processor alters a memory location that may be contained in an instruction cache, software 
must ensure that updates to memory are visible to the instruction fetching mechanism. 
Although the instructions to enforce coherency vary among implementations and hence 
many operating systems will provide a system service for this function, the following 
sequence is typical: 

1. dcbst (update memory)

2. sync (wait for update)

3. icbi (invalidate copy in cache)

4. isync (invalidate copy in own instruction buffer)
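
When more than one cache block has been modified, the sequence above is usually applied to each block in the range. The Python sketch below only generates a trace of the steps for a byte range; the helper name code_sync_sequence, the trace format, and the choice to issue one sync after all dcbst steps are assumptions for illustration, and the 32-byte block size is the MPC601 sector size.

```python
SECTOR_BYTES = 32  # coherency block size on the MPC601


def code_sync_sequence(start, length):
    """List the steps that make `length` bytes at `start` visible to fetch.

    The strings are a readable trace, not assembler syntax.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first = start & ~(SECTOR_BYTES - 1)
    last = (start + length - 1) & ~(SECTOR_BYTES - 1)
    blocks = range(first, last + 1, SECTOR_BYTES)

    steps = ["dcbst 0x%08X" % a for a in blocks]   # update memory
    steps.append("sync")                           # wait for update
    steps += ["icbi 0x%08X" % a for a in blocks]   # invalidate cached copy
    steps.append("isync")                          # discard prefetched copy
    return steps
```

A range that straddles a sector boundary produces one dcbst and one icbi step per sector touched.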

These operations are necessary because the memory may be in write-back mode. Since
instruction fetching may bypass the data cache, changes made to items in the data cache
may not be reflected in memory until after the instruction fetch completes.

The PowerPC architecture defines instructions for controlling both the instruction and data
caches. Instruction cache control instructions are valid instructions on the MPC601, but
may function differently than they do when used on PowerPC processors that have separate
instruction and data caches.

Data caches and unified caches must be kept consistent with other data caches, combined 
caches, memory, and I/O data transfers. However, to ensure consistency, aliased effective 
addresses (two effective addresses that map to the same physical address) must have the 
same page attributes (WIM bits). 

Note that in the PowerPC architecture, the term cache block, or simply block when used in
the context of cache implementations, refers to the unit of memory at which coherency is
maintained. For the MPC601 this is the eight-word sector. This value may be different for
other PowerPC implementations. In-depth descriptions of coding these instructions are
provided in Chapter 3, "Addressing Modes and Instruction Set Summary," and Chapter 10,
"Instruction Set."

4.8.1 Cache Line Compute Size Instruction (clcs)

The clcs instruction places the cache information specified in the instruction into a target 
register. This instruction is used by the POWER architecture to determine the maximum 
and minimum line sizes for cache implementations. For a complete description of this 
instruction, refer to Chapter 10, "Instruction Set." 

4.8.2 Data Cache Block Touch Instruction (dcbt)

This instruction provides a method for improving performance through the use of software- 
initiated prefetch hints. The MPC601 performs the fetch for the cases when the address hits 
in the UTLB or the BTLB, and when it is permitted load access from the addressed page. 
The operation is treated similarly to a byte load operation with respect to coherency. 

If the address translation does not hit in the UTLB or BTLB, or if it does not have load 
access permission, the instruction is treated as a no-op. 

If the access is directed to a cache-inhibited page, or to an I/O controller interface segment, 
then the bus operation occurs, but the cache is not updated. 

This instruction never affects the reference or change bits in the hashed page table. 

While the MC98601 maintains a cache line size of 64 bytes, the dcbt instruction may only
result in the prefetch of a 32-byte sector (the one directly addressed by the EA). The other 
32-byte sector in the cache line may or may not be fetched, depending on activity in the 
dynamic memory queue. 
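
The relationship between an effective address, its 32-byte sector, and the other sector of the same 64-byte line is simple address arithmetic. The Python sketch below shows it; the function name dcbt_sectors is hypothetical, and whether the second sector is actually fetched depends on queue activity as described above.

```python
SECTOR_BYTES = 32   # unit fetched by the touch
LINE_BYTES = 64     # two sectors per cache line


def dcbt_sectors(ea):
    """Return (addressed_sector, other_sector) base addresses for an EA.

    The addressed sector is the one the touch is expected to prefetch;
    the other sector shares the same 64-byte line.
    """
    addressed = ea & ~(SECTOR_BYTES - 1)
    other = addressed ^ SECTOR_BYTES   # flip the sector-select bit
    return addressed, other
```

For example, an EA of 0x1234 lies in the sector at 0x1220, and the companion sector of that line starts at 0x1200.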

A successful dcbt instruction will affect the state of the UTLB and cache LRU bits as
defined by the LRU algorithm.

Note that other PowerPC implementations may not take any action based on the execution
of this instruction, but they may prefetch the cache block corresponding to the EA into their
cache.

4.8.3 Data Cache Block Touch for Store Instruction (dcbtst) 

The dcbtst instruction behaves exactly like the dcbt instruction as implemented on the
MPC601. 

4.8.4 Data Cache Block Set to Zero Instruction (dcbz) 

If the block (the cache sector consisting of 32 bytes) containing the byte addressed by the 
EA is in the data cache, all bytes are cleared to 0.

If the block containing the byte addressed by the EA is not in the data cache and the 
corresponding page is caching-allowed, the block is established in the data cache without 
fetching the block from main memory, and all bytes of the block are cleared to 0. 

If the page containing the byte addressed by the EA is caching-inhibited or write-through, 
then the system alignment exception handler is invoked. 

If the block containing the byte addressed by the EA is in coherence required mode, and 
the block exists in the data cache(s) of any other processor(s), it is kept coherent in those 
caches. 

The dcbz instruction is treated as a store to the addressed byte with respect to address 
translation and protection. 

If the EA corresponds to an I/O controller interface segment (SR[T]=1), the dcbz 
instruction is treated as a no-op. 

See Chapter 5, "Exceptions," for more information about a possible delayed machine check 
exception interrupt that can occur by use of dcbz if the operating system has set up an 
incorrect memory mapping. 
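
The cases described in this section can be collected into one decision function. The Python below is an illustrative summary, not a hardware description; the function name dcbz_outcome, its argument names, and the order in which the conditions are tested are assumptions made for the example.

```python
def dcbz_outcome(t_bit, caching_inhibited, write_through, block_in_cache):
    """Summarize what dcbz does for one effective address (sketch).

    t_bit             -- SR[T] = 1 (I/O controller interface segment)
    caching_inhibited -- page is marked caching-inhibited
    write_through     -- page is marked write-through
    block_in_cache    -- the addressed 32-byte block is already cached
    """
    if t_bit:
        return "no-op"
    if caching_inhibited or write_through:
        return "alignment exception"
    if block_in_cache:
        return "clear block in cache"
    return "establish block without fetching, then clear"
```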

4.8.5 Data Cache Block Store Instruction (dcbst) 

If the block (the cache sector consisting of 32 bytes) containing the byte addressed by the 
EA is in coherence required mode, and a block containing the byte addressed by the EA is 
in the data cache of any processor and has been modified, the writing of it to main memory 
is initiated. 

The function of this instruction is independent of the write-through and
cache-inhibited/allowed modes of the block containing the byte addressed by the EA.

This instruction is treated as a load from the addressed byte with respect to address
translation and protection.

If the EA specifies a storage address for an I/O controller interface segment (segment
register T-bit=1), the dcbst instruction is treated as a no-op.

4.8.6 Data Cache Block Flush Instruction (dcbf) 

The action taken depends on the memory mode associated with the target, and on the state 
of the sector. The list below describes the action taken for the various cases. The actions 
described must be executed regardless of whether the page containing the addressed byte 
is in caching-inhibited or caching-allowed mode. 

• Coherence-required mode 

Unmodified sector — Invalidates copies of the sector in the caches of all processors. 

Modified sector — Copies the sector to memory. Invalidates copies of the sector in 
the caches of all processors. 

Absent sector — If modified copies of the sector are in the caches of other processors, 
causes them to be copied to memory and invalidated. If unmodified copies are in the 
caches of other processors, causes those copies to be invalidated.

• Coherence-not-required mode 

Unmodified sector — Invalidates the sector in the processor's cache. 

Modified sector — Copies the sector to memory. Invalidate the sector in the 
processor's cache. 

Absent sector — Does nothing. 

The MPC601 treats this instruction as a load from the addressed byte with respect to 
address translation and protection. 
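
The six cases listed above can be expressed as a short function that returns the required actions in order. This Python sketch restates the list and nothing more; the function name dcbf_actions and the action strings are hypothetical.

```python
def dcbf_actions(coherence_required, sector):
    """Return the ordered actions for dcbf (illustrative restatement).

    coherence_required -- memory mode of the target
    sector -- state in this processor's cache:
              "unmodified", "modified", or "absent"
    """
    if sector not in ("unmodified", "modified", "absent"):
        raise ValueError("unknown sector state: %r" % (sector,))

    actions = []
    if sector == "modified":
        actions.append("copy sector to memory")

    if coherence_required:
        if sector == "absent":
            # Other caches push modified copies and invalidate all copies.
            actions.append("flush copies held by other processors")
        else:
            actions.append("invalidate copies in all processors")
    elif sector != "absent":
        actions.append("invalidate sector in this processor's cache")
    return actions
```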

4.8.7 Enforce In-Order Execution of I/O Instruction (eieio) 

The eieio instruction provides an ordering function for the effects of load and store 
instructions executed by a given processor. Executing eieio ensures that all memory 
accesses previously initiated by the given processor are completed with respect to main 
memory before any memory accesses subsequently initiated by the processor access main 
memory. 

The eieio instruction orders loads and stores to caching-inhibited memory only. 

The eieio instruction is intended for use only in doing memory-mapped I/O. It can be 
thought of as placing a barrier into the stream of memory accesses issued by a processor, 
such that any given memory access appears to be on the same side of the barrier to both the 
processor and the I/O device. 

The eieio instruction may complete before previously initiated memory accesses have been
performed with respect to other processors and mechanisms.

Unlike the sync instruction, eieio need not serialize the processor. It requires only that the
processor execute memory accesses in the order described above, and enforce that order in
any queues in the memory subsystem.

4.8.8 Instruction Cache Block Invalidate Instruction (icbi)

The icbi instruction is provided in the PowerPC architecture for use in processors with 
separate instruction and data caches. The effective address is computed, translated, and 
checked for protection violations as defined in the PowerPC architecture; however, the 
instruction functions as a no-op on the MPC601.

The Data Cache Block Invalidate (dcbi) instruction may be used to invalidate instructions 
from the cache in the MPC601. Refer also to the following section that describes the 
requirements for self-modifying code. 

In other PowerPC processors, the icbi instruction executes as follows: 

• If the block (sector) containing the byte addressed by EA is in coherency-required 
mode and a sector containing the byte addressed by EA is in the instruction cache of
any processor, the sector is made invalid in all such processors, so that subsequent 
references cause the sector to be refetched. 

• If coherency is not required for the sector containing the byte addressed by EA and 
a sector containing the byte addressed by EA is in the instruction cache of this 
processor, the sector is made invalid in this processor so that subsequent references 
cause the sector to be fetched from main memory (or from a cache). 

4.8.9 Instruction Synchronize Instruction (isync) 

The isync instruction waits for all previous instructions to complete and then discards any 
prefetched instructions, causing subsequent instructions to be fetched (or refetched) from 
memory and to execute in the context established by the previous instructions. This 
instruction has no effect on other processors or on their caches. 

4.9 Bus Operations Caused by Cache Control 
Instructions 

Table 4-6 provides an overview of the bus operations initiated by cache control
instructions. Note that Table 4-6 assumes that the WIM bits are set to 001; that is, since the
cache is operating in write-back mode, caching is permitted and coherency is enforced.

Table 4-6. Bus Operations Caused by Cache Control Instructions (WIM = 001)

Operation   Cache State  Next Cache State  Bus Operations   Comment
sync/eieio  Don't care   No change         sync             First clears memory queue
dcbi        Don't care   I                 Kill             -
dcbf        I, S, E      I                 Flush            -
dcbf        M            I                 Write with kill  Sector is pushed
dcbst       I, S, E      No change         Clean            -
dcbst       M            E                 Write with kill  Sector is pushed
dcbz        I            M                 Kill             May also cast out a sector
dcbz        S            M                 Kill             -
dcbz        E, M         M                 None             Writes over modified data
dcbt        I            No change         Read             State change on reload may cast out sector
dcbt        S, E, M      No change         None             -



Table 4-6 does not include noncacheable or write-through cases, nor does it completely 
describe the mechanisms for the operations described. These conditions are described in 
Section 4.11, "MESI State Transactions." 

The cache control instructions are described in detail in Chapter 3, "Addressing Modes and 
Instruction Set Summary," and Chapter 10. Several of the cache control instructions 
broadcast onto the MPC601 interface so that all processors in a multiprocessor system can 
take appropriate actions. The MPC601 contains snooping logic to monitor the bus for these 
commands and the control logic required to keep the cache and the memory queues 
coherent. Additional details on the specific bus operations performed by the MPC601 can
be found in Chapter 9, "System Interface Operation."

4.10 Memory Unit 

The MPC601's memory unit contains read and write queues that buffer operations between
the external interface and the cache. These operations are comprised of operations resulting 
from load and store instructions that are cache misses, read and write operations required 
to maintain cache coherency, and table search operations. As shown in Figure 4-5, the read 
queue contains two elements and the write queue contains three elements. Each element of 
the write queue can contain as many as eight words (one sector) of data. One element of the 
write queue, marked snoop in Figure 4-5, is dedicated to writing cache sectors to system 
memory after a modified sector is hit by a snoop from another processor or snooping device 
on the system bus. The use of this queue guarantees a high-priority operation receives a 
deterministic response time when snooping hits a modified sector.

Figure 4-5. Memory Unit

The other two elements in the write queue are used for store operations and writing back 
modified sectors that have been deallocated by updating the queue; that is, when a cache 
sector is full, the least-recently used cache sector is deallocated by first being copied into
the write queue and from there to system memory if it is modified. Note that snooping can 
occur after a sector has been pushed out into the write queue and before the data has been 
written to system memory. Therefore, to maintain a coherent memory, the write queue 
elements are compared to snooped addresses in the same way as the cache tags. If a snoop 
hits a write queue element, the data is first stored in system memory before it can be loaded 
into the cache of the snooping bus master. Full coherency checking between the cache and 
the write queue prevents dependency conflicts. 

The retry signals and bus operations pertaining to snooping are described in Chapter 9, 
"System Interface Operation." 

Execution of a load or store instruction is considered complete when the associated address 
translation completes, guaranteeing that the instruction has completed to the point where it 
is known that it will not generate an internal exception. However, after address translation 
is complete, a read or write operation can generate an external exception. 

Load and store instructions are always issued and translated in program order with respect 
to other load and store instructions. However, a load or store operation that hits in the cache 
can complete ahead of those that miss in the cache. The MPC601 ensures memory 
consistency by comparing target addresses and prohibiting instructions from completing 
out of order if an address matches. Load and store operations can be forced to execute in
strict program order.

4.10.1 Memory Unit Queuing Structure

The memory queue receives requests from the cache unit for arbitration onto the MPC601
bus interface. These requests may either be presented immediately to the bus interface logic 
or they may be queued for future arbitration onto the bus. The memory queue consists of a 
two-element load queue and a three-element write queue. Each write queue element can 
hold a sector of data (32 bytes) associated with a single address. 
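
The queue sizes and the snoop comparison against the write queue can be pictured with a small software model. The Python class below is only a sketch: the name MemoryQueueModel and its methods are hypothetical, requests are reduced to sector addresses, and the write-queue element reserved for snoop pushes is not distinguished from the other two.

```python
SECTOR_BYTES = 32


class MemoryQueueModel:
    """Toy model of the two-element read and three-element write queues."""

    READ_DEPTH = 2
    WRITE_DEPTH = 3

    def __init__(self):
        self.reads = []    # sector addresses of queued reads
        self.writes = []   # sector addresses of queued writes

    @staticmethod
    def sector(addr):
        return addr & ~(SECTOR_BYTES - 1)

    def queue_read(self, addr):
        """Queue a read; return False if the read queue is full."""
        if len(self.reads) >= self.READ_DEPTH:
            return False
        self.reads.append(self.sector(addr))
        return True

    def queue_write(self, addr):
        """Queue a sector write; return False if the write queue is full."""
        if len(self.writes) >= self.WRITE_DEPTH:
            return False
        self.writes.append(self.sector(addr))
        return True

    def snoop_hits_write_queue(self, addr):
        """Compare a snooped address with the queued write sectors."""
        return self.sector(addr) in self.writes
```

A snooped address hits if it falls anywhere inside a queued sector, which mirrors the way the write queue elements are compared in the same way as the cache tags.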

Some operations presented to the memory queue cannot be queued. These operations 
typically require synchronization with respect to either the execution units, the cache, or 
the memory queue itself. In general, when these requests are presented and not arbitrated 
directly onto the bus, they stall above the cache (but do not necessarily prevent use of the 
cache) and attempt to re-arbitrate on the next cycle. These operations include the following: 

• Cache control instructions that are broadcast

• Execution of the tlbie instruction

• Execution of the sync instruction

• Execution of the eieio instruction

• Accesses to I/O controller interface segments

• Cache requests for exclusive ownership when the sector is resident but not exclusive
in the cache

The memory queue also allows the optional loading of the sector adjacent to the one
containing the critical data. As the memory read queue receives and processes cache sector
reload requests, it is advantageous to fetch the other sector if it is not already in the cache
unless fetching the other sector delays access to data required for the machine to continue
processing. The memory unit logic detects whether other operations are pending; if not, it
initiates a fetch for the other sector. Note that this function can be disabled by setting bit 26
in HID0 (for instruction fetch misses) and bit 27 in HID0 (for load/store misses).
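
Because PowerPC registers number their bits from the most-significant end (bit 0 is the MSB of a 32-bit register), bits 26 and 27 correspond to the masks 0x20 and 0x10. The Python sketch below shows the conversion; the helper names and the constant names are hypothetical, and in practice HID0 is written with the mtspr instruction in supervisor mode.

```python
def ppc_bit(n, width=32):
    """Mask for bit n in PowerPC numbering (bit 0 is the MSB)."""
    if not 0 <= n < width:
        raise ValueError("bit number out of range")
    return 1 << (width - 1 - n)


# Hypothetical names for the two disable bits described above.
HID0_NO_PAIR_ON_IFETCH = ppc_bit(26)      # instruction fetch misses
HID0_NO_PAIR_ON_LOAD_STORE = ppc_bit(27)  # load/store misses


def disable_adjacent_sector_fetch(hid0, ifetch=True, load_store=True):
    """Return a HID0 image with the selected disable bits set."""
    if ifetch:
        hid0 |= HID0_NO_PAIR_ON_IFETCH
    if load_store:
        hid0 |= HID0_NO_PAIR_ON_LOAD_STORE
    return hid0 & 0xFFFFFFFF
```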

4.10.2 Memory Unit Queuing Priorities

This section describes the priorities for access to the system interface:

1. High-priority cache push-out operations

2. Normal snoop push-out operations

3. I/O controller interface segment accesses that incur no additional delays (that is,
they have not been retried because of latency)

4. Cache instruction operations

5. Read requests, such as loads, RWITMs, and instruction fetches

6. Single-beat write operations

7. sync instructions

8. Optional cache-line fill operations

9. Cache sector cast-out operations

10. I/O controller interface segment accesses that incur additional delays (that is, they
have been retried because of latency)
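
A fixed-priority selection over this list can be sketched as follows. The Python below is illustrative only; it assumes the list is ordered from highest to lowest priority, and the request names and the function next_request are hypothetical labels for the ten categories.

```python
# Request classes in the order listed above (assumed highest first).
PRIORITY = [
    "high-priority cache push-out",
    "normal snoop push-out",
    "I/O controller access (not retried)",
    "cache instruction operation",
    "read request",
    "single-beat write",
    "sync",
    "optional cache-line fill",
    "cache sector cast-out",
    "I/O controller access (retried)",
]


def next_request(pending):
    """Pick the pending request class that wins arbitration, or None."""
    if not pending:
        return None
    return min(pending, key=PRIORITY.index)
```

For example, a pending read request is selected ahead of a pending sync, and a snoop push-out is selected ahead of both.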

4.10.3 Bus Interface 

The bus interface logic sequences operations onto the MPC601 bus according to defined 
protocols. The bus interface logic is also responsible for snooping other bus traffic, 
presenting these operations to the rest of the device for coherency considerations and 
reporting the appropriate snoop status onto the bus. 

For additional information about the MPC601 bus interface and the bus protocols, refer to 
Chapter 9, "System Interface Operation." 

4.11 MESI State Transactions 

Table 4-7 shows MESI state transitions for various operations. 

Table 4-7. MESI State Transitions

                      Cache            Bus        Current  Next                                                   Bus
Operation             Operation        sync  WIM  State    State  Cache Actions                                   Operation

Load or Fetch         Read             No    x0x  I        Same   1 Cast out of modified sector 1 (as required)   Write with kill
                                                                  2 Pass four-beat read to memory queue           Read
                                                                  3 Secondary cast out of sector 2 (as required)  Write with kill

Load or Fetch (T=0)   Read             No    x0x  S,E,M    Same   Read data from cache                            -

Load or Fetch (T=0)   Read             No    x1x  I        Same   Pass single-beat read to memory queue           Read
or Load (T=1,
BUID=x'7F')

Load or Fetch (T=0)   Read             No    x1x  S,E      I      CRTRY read
or Load (T=1,
BUID=x'7F')

Load or Fetch (T=0)   Read             No    x1x  M        I      CRTRY read (push sector to write queue)         Write with kill
or Load (T=1,
BUID=x'7F')

Load (T=1,            I/O controller         x1x  -        -      -                                               I/O load
BUID≠x'7F')           load

larx                  Read             Acts like other reads but bus operation uses special encoding



MOTOROLA 



Chapter 4. Cache and Memory Unit Operation 






Table 4-7. MESI State Transitions (Continued)

                      Cache            Bus        Current  Next                                                   Bus
Operation             Operation        sync  WIM  State    State  Cache Actions                                   Operation

Store (T=0)           Write            No    00x  I        Same   1 Cast out of modified sector                   Write with kill
                                                                  2 Pass RWITM to memory queue                    RWITM
                                                                  3 Secondary cast out of sector 2                Write with kill

Store (T=0)           Write            Yes   00x  S        Same   1 CRTRY write                                   -
                                                                  2 Pass kill                                     Kill
                                                           M      3 Write data to cache                           -

Store (T=0)           Write            No    00x  E,M      M      Write data to cache                             -

Store ≠ stcx (T=0)    Write            No    10x  I        Same   Pass single-beat write to memory queue          Write with flush

Store ≠ stcx (T=0)    Write            No    10x  S,E      Same   1 Write data to cache                           -
                                                                  2 Pass single-beat write to memory queue        Write with flush

Store ≠ stcx (T=0)    Write            No    10x  M        E      1 CRTRY write                                   -
                                                                  2 Push sector to write queue                    Write with kill

Store (T=0) or stcx   Write            No    x1x  I        Same   Pass single-beat write to memory queue          Write with flush
(WIM=10x) or store
(T=1, BUID=x'7F')

Store (T=0) or stcx   Write            No    x1x  S,E      I      CRTRY write
(WIM=10x) or store
(T=1, BUID=x'7F')

Store (T=0) or stcx   Write            No    x1x  M        I      1 CRTRY write                                   -
(WIM=10x) or store                                                2 Push sector to write queue                    Write with kill
(T=1, BUID=x'7F')

Store (T=1,           I/O controller   No    -    -        -      -                                               I/O store request
BUID≠x'7F')

stcx                  Conditional      If the reserved bit is set, this operation is like other writes except the bus
                      write            operation uses a special encoding.

Table 4-7. MESI State Transitions (Continued)



Operation 


Cache 
Operation 


Bus 
sync 


WIM 


Current 
State 


Next 
State 


Cache Actions 


Bus 
Operation 


tibi 


TLB 
invalidate 


Yes 


XXX 


X 


X 


1 CRTRYTLBI 


TLB invalidate 


2 Pass TLB 1 


— 


3 No action 


— 


sync/eieio 


Synchroniz 
ation 


Yes 


XXX 


X 


X 


1 CRTRYsync 


dsync 


2 Pass sync 


— 


3 No action 




debt 


Data cache 
block flush 


Yes 


XXX 


I.S.E 


Same 


1 CRTRYdcbf 


— 


2 Pass flush 


Flush 


Same 


1 


3 State change only 


— 


dcbf 


Data cache 
block flush 


No 


XXX 


M 


1 


Push sector to write 
queue 


Write with kill 


dcbst 


Data cache 
block store 


Yes 


XXX 


i.S.E 


Same 


1 CRTRY dcbst 


— 


2 Pass clean 


Clean 


Same 


Same 


3 No action 


— 


dcbst 


Data cache 
block store 


No 


xxm 


M 


E 


Push sector to write 
queue 


Write with kill 


dcbz 


Data cache 
block set to 
zero 


No 


xlx 


X 


X 


Alignment trap 





dcbz 


Data cache 
block set to 
zero 


No 


10x 


X 


X 


Alignment trap 





dcbz 


Data cache 
block set to 
zero 


Yes 


OOx 


1 


Same 


1 CRTRY dcbz 


— 


2 Cast out of modified 
sector 


Write with kill 


3 Pass kill 


Kill 


4 Secondary cast out 
of sector 2 


Write with kill 


Same 


M 


5 Clear sector 


— 


dcbz 


Data cache 
block set to 
zero 


Yes 


OOx 


S 


Same 


1 CRTRY deb? 


— 


2 Pass kill 


Kill 


Same 


1^ 


3 Clear sector 


— 


dcbz 


Data cache 
block set to 
zero 


No 


OOx 


E,M 


M 


Clear sector 


— 
Table 4-7. MESI State Transitions (Continued)

Operation | Cache Operation | Bus Sync | WIM | Current State | Next State | Cache Actions | Bus Operation
dcbt | Data cache block touch | No | x1x | I | Same | Pass single-beat read to memory queue | Read
dcbt | Data cache block touch | No | x1x | S,E | I | CRTRY read | —
dcbt | Data cache block touch | No | x1x | M | I | 1 CRTRY read | —
 | | | | | | 2 Push sector to write queue | Write with kill
dcbt | Data cache block touch | No | x0x | I | Same | 1 Cast out of modified sector (as required) | Write with kill
 | | | | | | 2 Pass four-beat read to memory queue | Read
 | | | | | | 3 Secondary cast out of sector (as required) | Write with kill
dcbt | Data cache block touch | No | x0x | S,E,M | Same | No action | —
Secondary cast out | Secondary cast out | No | xxx | x | Same | Cast out | Write with kill
Single-beat read | Reload dump 1 | No | xxx | x | Same | Forward data_in | —
Four-beat read (quad-word 1) | Reload dump 1 | No | xxx | x | Same | 1 Forward data_in | —
 | | | | | | 2 Write data_in to cache | —
Four-beat read (quad-word 2)—S | Reload dump 2 | No | xxx | x | Same | Write data_in to cache | —
Four-beat read (quad-word 2)—E | Reload dump 2 | No | xxx | x | Same | Write data_in to cache | —
Four-beat write (quad-word 1) | Reload dump 1 | No | xxx | x | Same | 1 Splice and forward data_in | —
 | | | | | | 2 Write data_in to cache | —
Four-beat write (quad-word 2) | Reload dump 2 | No | xxx | x | Same | Write data_in to cache | —
Optional reload of adjacent sector (quad-word 1) | Reload dump 1 | No | xxx | x | Same | Write data_in to cache | —
Table 4-7. MESI State Transitions (Continued)

Operation | Cache Operation | Bus Sync | WIM | Current State | Next State | Cache Actions | Bus Operation
Optional reload of adjacent sector (quad-word 2)—S | Reload dump 2 | No | xxx | I | S | Write data_in to cache | —
Optional reload of adjacent sector (quad-word 2)—E | Reload dump 2 | No | xxx | I | E | Write data_in to cache | —
E→S | Snoop | No | xxx | E | S | State change only (committed) | —
S→I | Snoop | No | xxx | S | I | State change only (committed) | —
E→I | Snoop | No | xxx | E | I | State change only (committed) | —
M→I | Snoop | No | xxx | M | I | State change only (committed) | —
Push | Snoop | No | xxx | M | S | Conditionally push | Write with kill
Push M→I | Snoop | No | xxx | M | I | Conditionally push | Write with kill
Push M→E | Snoop | No | xxx | M | E | Conditionally push | Write with kill



Note that dcbt is presented to the cache as a load operation.
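
As an illustration only, the snoop-driven rows of Table 4-7 can be read as a lookup from (operation, current state) to (next state, cache action, bus operation). The sketch below is not part of the MPC601 documentation; the names SNOOP_TRANSITIONS and snoop_transition are hypothetical, and the arrows are written as "->" for convenience.

```python
# Hypothetical lookup built from the snoop rows of Table 4-7.
# Key: (operation, current state). Value: (next state, cache action, bus operation).
# A bus operation of None stands for the "-" entries in the table.
SNOOP_TRANSITIONS = {
    ("E->S", "E"): ("S", "State change only (committed)", None),
    ("S->I", "S"): ("I", "State change only (committed)", None),
    ("E->I", "E"): ("I", "State change only (committed)", None),
    ("M->I", "M"): ("I", "State change only (committed)", None),
    ("Push", "M"): ("S", "Conditionally push", "Write with kill"),
    ("Push M->I", "M"): ("I", "Conditionally push", "Write with kill"),
    ("Push M->E", "M"): ("E", "Conditionally push", "Write with kill"),
}


def snoop_transition(operation, current_state):
    """Return (next_state, cache_action, bus_operation) for a snoop row."""
    try:
        return SNOOP_TRANSITIONS[(operation, current_state)]
    except KeyError:
        raise ValueError(
            "no snoop row for %r in state %r" % (operation, current_state)
        )


if __name__ == "__main__":
    print(snoop_transition("Push M->E", "M"))
```

Only the snoop rows are modeled here; the processor-initiated rows carry multi-step cache actions that a flat lookup like this does not capture.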
Chapter 5
Exceptions 



The PowerPC exception mechanism allows the processor to change to supervisor state as 
a result of external signals, errors, or unusual conditions arising in the execution of 
instructions. When exceptions occur, information, such as the instruction that should be 
executed after control is returned to the original program and the contents of the machine 
state register, is saved to the save/restore registers (SRR0 and SRR1), program control
passes from user to supervisor level, and software continues execution at an address 
(exception vector) predetermined for each exception. 

Although multiple exception conditions can map to a single exception vector, the specific 
condition can be determined by examining a register associated with the exception — for 
example, the DAE/source instruction service register (DSISR) and the floating-point status 
and control register (FPSCR). Additionally, specific exception conditions can be explicitly 
enabled or disabled by software. 

Except for the catastrophic asynchronous exceptions (machine check and system reset), the
MPC601 exception model is precise, defined as follows:

• The exception handler is given the address of the excepting instruction (or the next
instruction to execute in the case of asynchronous, precise exceptions).

• All instructions prior in the instruction stream to the excepting instruction have 
completed execution and have written back their results. 

• No instructions subsequent to the excepting instruction in the instruction stream 
have been issued. 

Although the PowerPC architecture supports out-of-order instruction dispatch, exceptions 
are handled in program order; therefore, while exception conditions may be recognized out 
of order, they are handled strictly in order. When an instruction-caused exception is 
recognized, any unexecuted instructions that appear earlier in the instruction stream, 
including any that have not yet entered execute state, are allowed to complete. Any 
exceptions caused by those instructions are handled in order. Likewise, exceptions that are
asynchronous and precise are recognized when they occur, but are not handled until all 
instructions currently in execute stage successfully complete execution and report their 
results. 
Unless a catastrophic condition causes a system reset or machine check exception, only one
exception is handled at a time. If, for example, a single instruction encounters multiple 
exception conditions, those conditions are encountered sequentially. After the exception 
handler handles an exception, the instruction execution continues until the next exception 
condition is encountered. This method of recognizing and handling exception conditions 
sequentially guarantees that exceptions are recoverable. 

Exception handlers should save the information held in SRR0 and SRR1 early to prevent
the program state from being lost due to a system reset or machine check exception or to
an instruction-caused exception in the exception handler.

This chapter describes the MPC601 exception model, explains each class of exception,
and describes how the program state is saved for individual exceptions.



5.1 Exception Classes 

All MPC601 exceptions can be described as either precise or imprecise and either 
synchronous or asynchronous. Asynchronous exceptions are caused by events external to 
the processor's execution; synchronous exceptions, which are all handled precisely by the 
MPC601, are caused by instructions. 

The MPC601 exceptions are shown in Table 5-1. 

Table 5-1. MPC601 Exception Classifications

Synchronous/Asynchronous | Precise/Imprecise | Exception Type
Asynchronous | Imprecise | Machine check, system reset
Asynchronous | Precise | External interrupt, decrementer
Synchronous | Precise | Instruction-caused exceptions



Although exceptions have other characteristics as well, such as whether they are maskable 
or nonmaskable, the distinctions shown in Table 5-1 define categories of exceptions that the 
MPC601 handles uniquely. Note that Table 5-1 includes no synchronous, imprecise
exceptions. While the PowerPC architecture supports imprecise floating-point exceptions,
they do not occur in the MPC601.

Exceptions, and conditions that cause them, are listed in Table 5-2. 
Table 5-2. Exceptions and Conditions

Exception Type | Vector Offset (hex) | Causing Conditions

Reserved | 00000 | —

System reset | 00100 | A system reset is caused by the assertion of either SRESET or HRESET.

Machine check | 00200 | A machine check is caused by the assertion of the TEA signal.

Data access | 00300 | The cause of a data access exception can be determined by the bit settings in the DSISR, listed as follows:
    1   Set if the translation of an attempted access is not found in the primary hash table entry group (HTEG), or in the rehashed secondary HTEG, or in the range of a BAT register; otherwise cleared.
    4   Set if a memory access is not permitted by the page or BAT protection mechanism described in Chapter 6; otherwise cleared.
    5   Set if the access was to an I/O segment (SR[T]=1) by an lwarx, stwcx., or lscbx instruction; otherwise cleared.
    6   Set for a store operation and cleared for a load operation.
    9   Set if an EA matches the address in the DABR while in one of the three compare modes.
    11  Set if eciwx or ecowx is used and EAR[E] is cleared.

Instruction access | 00400 | An instruction access exception is caused when an instruction fetch cannot be performed for any of the following reasons:
    • The effective address cannot be translated. That is, there is a page fault for this portion of the translation, so an instruction access exception must be taken to retrieve the translation from a storage device such as a hard disk drive.
    • The fetch access is to an I/O segment.
    • The fetch access violates memory protection. If the K bits in the segment register and the PP bits in the PTE are set to prohibit read access, instructions cannot be fetched from this location.

External interrupt | 00500 | An external interrupt occurs when the INT signal is asserted.

Alignment | 00600 | An alignment exception is caused when the MPC601 cannot perform a memory access for one of the following reasons:
    • The operand of a floating-point load or store operation is in an I/O segment (SR[T]=1).
    • An lscbx instruction crosses a page boundary.
    • The operand of a load or store (including string loads and stores) crosses a protection boundary.
    • The operand of an lmw or stmw instruction crosses a segment or BAT boundary.
    • The operand of a Data Cache Block Set to Zero (dcbz) instruction is in a page specified as write-through or cache-inhibited for a page-address translation access.
    • In little-endian mode, any operand that is not properly aligned.
    • In little-endian mode, any attempted execution of the string/multiple instructions.
Table 5-2. Exceptions and Conditions (Continued)

Exception Type | Vector Offset (hex) | Causing Conditions

Program | 00700 | A program exception is caused by one of the following exception conditions, which correspond to bit settings in SRR1 and arise during execution of an instruction:
    • Floating-point enabled exception: A floating-point enabled exception condition is generated when the following condition is met:
      (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1.
      FPSCR[FEX] is set by the execution of a floating-point instruction that causes an enabled exception or by the execution of a "move to FPSCR" instruction that results in both an exception condition bit and its corresponding enable bit being set in the FPSCR.
    • Illegal instruction: An illegal instruction program exception is generated when execution of an instruction is attempted with an illegal opcode or illegal combination of opcode and extended opcode fields, or when execution of an optional instruction not provided in the MPC601 is attempted (these do not include those optional instructions that are treated as no-ops).
    • Privileged instruction: A privileged instruction type program exception is generated when the execution of a privileged instruction is attempted and the MSR register user privilege bit, MSR[PR], is set. In the MPC601, this exception is generated for mtspr or mfspr with an invalid SPR field if SPR[0]=1 and MSR[PR]=1. This may not be true for all PowerPC processors.
    • Trap: A trap type program exception is generated when any of the conditions specified in a trap instruction is met.
    • Illegal operations: The MPC601 takes illegal operation program exceptions for unimplemented PowerPC instructions. The PowerPC instruction set is described in Chapter 3.

Floating-point unavailable | 00800 | A floating-point unavailable exception is caused by an attempt to execute a floating-point instruction (including floating-point load, store, and move instructions) when the floating-point available bit is disabled, MSR[FP]=0.

Decrementer | 00900 | The decrementer exception occurs when the most significant bit of the decrementer (DEC) register transitions from 0 to 1.

I/O controller interface error | 00A00 | An I/O controller interface error exception is taken only when an operation to an I/O segment fails (such a failure is indicated to the MPC601 by a particular bus reply packet). If an I/O controller interface exception is taken on a memory access directed to an I/O segment, SRR0 contains the address of the instruction following the offending instruction. Note that this exception may not be implemented in other PowerPC processors.

Reserved | 00B00 | —

System call | 00C00 | A system call exception occurs when a System Call (sc) instruction is executed.

Reserved | 00E00 | Other PowerPC processors may use this vector for floating-point assist exceptions.

Reserved | 00E10-00FFF | —

Reserved | 01000-01FFF | Reserved, implementation-specific
Table 5-2. Exceptions and Conditions (Continued)

Exception Type | Vector Offset (hex) | Causing Conditions

Run mode exception | 02000 | The run mode exception is taken depending on the settings of the HID1 register and the MSR[SE] bit.
    The following modes correspond with bit settings in the HID1 register:
    • Normal run mode: No address breakpoints are specified, and the MPC601 executes from zero to three instructions per cycle.
    • Single instruction step mode: One instruction is processed at a time. The appropriate break action is taken after an instruction is executed and the processor quiesces.
    • Limited instruction address compare: The MPC601 runs at full speed (in parallel) until the EA of the instruction being decoded matches the EA contained in HID2. Addresses for branch instructions and floating-point instructions may never be detected.
    The following mode is taken when the MSR[SE] bit is set:
    • MSR[SE] trace mode: Note that in other PowerPC implementations, the trace exception is a separate exception with its own vector x'00D00'.

Reserved | 02001-03FFF | —
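
The DSISR bit settings listed for the data access exception in Table 5-2 can be decoded as in the following sketch. It is illustrative only and assumes the PowerPC bit-numbering convention in which bit 0 is the most significant bit of a 32-bit register; the names DSISR_BITS, reg_bit, and decode_dsisr are hypothetical, and the descriptions are shortened paraphrases of the table text.

```python
# Hypothetical decoder for the DSISR bits listed under "Data access".
# Assumption: bit 0 is the most significant bit of the 32-bit register.
DSISR_BITS = {
    1: "translation not found in primary HTEG, secondary HTEG, or BAT range",
    4: "access not permitted by page or BAT protection",
    5: "lwarx, stwcx., or lscbx access to an I/O segment (SR[T]=1)",
    6: "store operation (cleared for a load)",
    9: "EA matched the DABR in one of the compare modes",
    11: "eciwx or ecowx used with EAR[E] cleared",
}


def reg_bit(value, n, width=32):
    """Return bit n of value, counting bit 0 as the most significant bit."""
    return (value >> (width - 1 - n)) & 1


def decode_dsisr(dsisr):
    """Return the descriptions of all listed DSISR bits that are set."""
    return [text for n, text in sorted(DSISR_BITS.items()) if reg_bit(dsisr, n)]


if __name__ == "__main__":
    for line in decode_dsisr(0x42000000):
        print(line)
```

A handler would typically read the DSISR with mfspr and pass the value to a routine of this shape; how the register is read is outside the scope of the sketch.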



5.1.1 Precise Exceptions

In the MPC601, all synchronous exceptions and the asynchronous external interrupt and 
decrementer exceptions are handled precisely; that is, all instructions that occur in the 
instruction stream before the excepting event appear to complete and subsequent 
instructions execute after the exception has been handled. When one of the MPC601's
precise exceptions occurs, SRR0 is set to point to an instruction such that all prior
instructions in the instruction stream have completed execution and no subsequent
instruction has begun execution. However, depending on the exception type, the instruction
addressed by SRR0 may not have completed execution.

When an exception occurs, instruction dispatch (the issuance of instructions by the 
instruction fetch mechanism to any instruction execution mechanism) is halted and the 
following synchronization is performed: 

1. The exception mechanism waits for all previous instructions in the instruction 
stream to complete to a point where they report all exceptions they will cause. 

2. The processor ensures that all previous instructions in the instruction stream 
complete in the context in which they began execution. 

3. The exception mechanism is responsible for saving and restoring the processor state. 
After control passes back to the user level, there are no instructions in execute stage, 
and the user program instructions are dispatched and executed in this new context. 

The synchronization described above is sometimes referred to as context synchronization. 
5.1.1.1 Synchronous/Precise Exceptions

In the MPC601, all exceptions caused by instructions are precise. When instruction 
execution causes a precise exception, the following conditions exist at the exception point: 

• Depending on the type of exception, SRR0 addresses either the instruction causing
the exception or the immediately following instruction. The instruction addressed 
can be determined from the exception type and status bits, which are described with 
the description of each exception. 

• All instructions that precede the excepting instruction are allowed to complete 
before the exception is processed. However, some memory accesses generated by 
these preceding instructions may not have been performed with respect to all other 
processors or system devices. 

• The instruction causing the exception may not have begun execution, may have 
partially completed, or may have completed, depending on the exception type. 

• No subsequent instructions in the instruction stream complete execution.

Note that other PowerPC microprocessors may support optional imprecise floating-point 
exception modes. While parallel processing allows the possibility of two instructions 
reporting exceptions during the same cycle, they are handled in program order. If a single
instruction generates multiple exception conditions, those exceptions are handled 
sequentially, as described in Section 5.1.3, "Sequential Exception Processing." Exception 
priorities are described in Section 5.1.2, "Exception Priorities." 

5.1.1.2 Asynchronous/Precise Exceptions

The MPC601 supports two asynchronous, precise exceptions — external interrupt and 
decrementer exceptions. For asynchronous exceptions, the following conditions exist at the 
exception point: 

• All instructions issued before the event that caused the exception, and any 
undispatched instructions that precede those instructions in the instruction stream, 
appear to have completed before the exception is processed. However, some 
memory accesses generated by these preceding instructions may not have been 
performed with respect to all other processors or system devices. 

• SRR0 addresses the instruction that would have been executed next had the
exception not occurred. 

• Architecturally, no subsequent instructions in the instruction stream complete
execution. 

These two exceptions are maskable. When the machine state register external interrupt
enable bit is cleared (MSR[EE]=0), these exception conditions are latched and are not
recognized until the EE bit is set. MSR[EE] is cleared automatically when an exception is 
taken to delay recognition of conditions causing asynchronous, precise exceptions. No two 
precise exceptions can be recognized simultaneously. Handling of an asynchronous, 
precise exception does not begin until all currently executing instructions complete and any 
synchronous, precise exceptions caused by those instructions have been handled, as 
described in Section 5.1.3, "Sequential Exception Processing." Exception priorities are
described in Section 5.1.2, "Exception Priorities." 

5.1.1.3 Asynchronous, Imprecise Exceptions 

There are two asynchronous, imprecise exceptions — system reset and machine check. 
These two exceptions have the highest priority and can occur while other exceptions are 
being processed. Note that asynchronous, imprecise exceptions are never delayed; 
therefore, if two of these exceptions occur in immediate succession, the state information 
saved by the first exception may be overwritten when the subsequent exception occurs. 

These exceptions cannot be masked by using the MSR[EE] bit. A machine check exception
can only occur if the machine check enable bit, MSR[ME], is set. If MSR[ME] is cleared, 
the processor goes directly into checkstop state. When an imprecise exception occurs, the 
following conditions exist at the exception point: 

• The integer instruction pipeline acts as the point of reference for all instructions in 
the pipeline. When an asynchronous, imprecise exception occurs, floating-point 
instructions that have begun execution out-of-order ahead of integer instructions 
that have yet to be decoded do not complete execution. 

• SRR0 addresses either the instruction causing the exception or some instruction
following the instruction causing the exception.

• An exception is generated such that all instructions preceding the instruction
addressed by SRR0 appear to have completed with respect to the executing
processor.

Neither the instruction addressed by SRR0 nor any subsequent instructions have begun
execution. 

5.1.2 Exception Priorities 

This section describes how exceptions are prioritized. Exceptions are roughly prioritized 
by exception class, as follows: 

1. Asynchronous, imprecise exceptions have priority over all other exceptions. These 
exceptions are taken forcibly, and do not wait for the completion of any precise 
exception handling. 

2. Synchronous, precise exceptions are caused by instructions and are handled in strict 
program order. 

3. Asynchronous, precise exceptions (external interrupt and decrementer exceptions) 
are delayed until higher priority exceptions are handled. 

The exceptions are listed in Table 5-3 in order of highest to lowest priority. 
Table 5-3. Exception Priorities

Exception Class | Priority | Exception

Asynchronous, imprecise | 1 | System reset: The system reset exception has the highest priority of all exceptions. If this exception exists, the exception mechanism ignores all other exceptions and generates a system reset exception. Instructions issued before the generation of a system reset exception cannot generate a nonmaskable exception.

Asynchronous, imprecise | 2 | Machine check: The machine check exception is the second-highest priority exception. If this exception occurs, the exception mechanism ignores all other exceptions (except reset) and generates a machine check exception. Instructions issued before the generation of a machine check exception cannot generate a nonmaskable exception.

Synchronous, precise | 3 | Instruction dependent: When an instruction causes an exception, the exception mechanism waits for any instructions prior to the excepting instruction in the instruction stream to execute. Any exceptions caused by these instructions are handled first. It then generates the appropriate exception if no higher priority exception exists when the exception is to be generated.
    Note that a single instruction can cause multiple exceptions. The ordering of such exceptions is described in Section 5.1.3, "Sequential Exception Processing."

Asynchronous, precise | 4 | External interrupt: The external interrupt exception mechanism waits for instructions currently dispatched to complete execution. After all dispatched instructions are executed, and any exceptions caused by those instructions are handled, the exception mechanism generates this exception if no higher priority exception exists. This exception is delayed if MSR[EE] is cleared.

Asynchronous, precise | 5 | Decrementer: This exception is the lowest priority exception. When this exception is created, the exception mechanism waits for all other possible exceptions to be reported. It then generates this exception if no higher priority exception exists. This exception is delayed if MSR[EE] is cleared.
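
The priority ordering in Table 5-3 can be sketched as a simple selection over a set of pending conditions. This is a simplified model for illustration, not a description of the MPC601 hardware: the names PRIORITY, MASKED_BY_EE, and next_exception are hypothetical, all instruction-caused exceptions are collapsed into one entry, and the MSR[ME] checkstop behavior is not modeled.

```python
# Hypothetical model of Table 5-3: highest priority first.
PRIORITY = [
    "system_reset",
    "machine_check",
    "instruction_caused",
    "external_interrupt",
    "decrementer",
]

# Per Table 5-3, these two are delayed while MSR[EE] is cleared.
MASKED_BY_EE = {"external_interrupt", "decrementer"}


def next_exception(pending, msr_ee):
    """Return the highest-priority pending exception that can be taken now.

    pending is a set of names from PRIORITY; msr_ee is the MSR[EE] bit.
    Returns None when every pending condition is currently delayed.
    """
    for name in PRIORITY:
        if name not in pending:
            continue
        if name in MASKED_BY_EE and not msr_ee:
            continue  # condition stays latched until MSR[EE] is set
        return name
    return None


if __name__ == "__main__":
    print(next_exception({"decrementer", "external_interrupt"}, msr_ee=1))
```

Because MSR[EE] is cleared automatically when an exception is taken, a model like this would report None for the two maskable conditions until the handler sets the bit again or executes rfi.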



5.1.3 Sequential Exception Processing 

Although more than one condition that can cause a precise exception can exist 
simultaneously, precise exceptions are handled sequentially in the MPC601. The order in
which exceptions are recognized is determined by program order and whether the 
exception is synchronous or asynchronous, precise or imprecise, and masked or 
nonmasked. 

Synchronous, precise exceptions (that is, exceptions that are caused by instructions) are 
handled in strict program order, even though instructions can execute and exceptions can 
be detected out of order. Therefore, before the MPC601 processes an instruction-caused
exception, it executes all instructions, and handles any resulting exceptions, that appear 
earlier in the instruction stream. 

A single instruction may generate multiple exception conditions. Of these exceptions, the 
MPC601 handles the exception it encounters first, then the execution of the excepting
instruction continues until the next excepting condition is encountered. In the POWER 
architecture, this feature is referred to as ordered exceptions. 
If the exception is asynchronous and precise (namely an external interrupt or decrementer
exception), the MPC601 synchronizes the pipeline by completing the execution of any 
instruction in the execute stage and any undispatched instructions that appear earlier in the 
instruction stream (including any exceptions they generate) before handling the external 
interrupt or decrementer exceptions. 

5.1.3.1 Recognition of Asynchronous, Imprecise Exceptions

Exceptions that are nonmasked, imprecise, and asynchronous (namely system reset or 
machine check exceptions) may occur at any time. That is, these exceptions are not delayed 
if another exception is being handled. As a result, state information for the interrupted 
exception may be lost; therefore, these exceptions are typically non-recoverable. 

All other precise exceptions have lower priority than system reset and machine check 
exceptions, and they can be delayed. 

5.1.3.2 Recognition of Precise Exceptions 

Only one precise exception can be reported at a time. (Note that PowerPC implementations 
that support imprecise-mode floating-point enabled exceptions allow those to be handled 
in the same manner as described in this section.) 

Figure 5-1 illustrates the ordering of precise exceptions. Note that this ordering is on a
per-instruction basis. If a precise, asynchronous exception condition occurs while
instruction-caused exceptions are being processed, its handling is delayed until all instruction-caused
exceptions are handled and the instruction completes execution. 

5.2 Exception Processing 

When an exception is taken, the MPC601 uses the save/restore registers, SRR0 and SRR1,
to save the contents of the machine state register for user-level mode and to identify where
instruction execution should resume after the exception is handled. Save/restore
register 0 (SRR0) is a 32-bit register that the MPC601 uses to save either the address of the
instruction that causes the exception, the one that follows, or the next instruction that would
have executed in the case of an asynchronous, imprecise exception. This address is used
when an rfi instruction is executed. SRR0 is shown in Figure 5-2.
[Figure 5-1 is a flowchart showing the order in which precise exception conditions are recognized. Box labels, in the order they appear: Instruction Access; Program Exception (Illegal/Privileged Instruction); rfi or mtmsr; Integer; Privileged Instruction; Floating Point; Precise Mode FP Enabled; FP Unavailable; System Call; Alignment; Data Access; FP Alignment; I/O Cont I/F Error; Data Access; Run Mode (including Trace); External Interrupt; Decrementer.]

Notes:
1 Not all floating-point instructions can cause enabled exceptions.
2 If the MSR bits FE0 and FE1 are set such that precise mode floating-point enabled exceptions are
enabled and the FPSCR[FEX] bit is set, a program exception will result.
3 Generating a trace exception after an rfi can cause unpredictable results.

Figure 5-1. Recognition of Precise Exception Conditions



[Figure: bits 0-31 hold SRR0 (the EA for resuming program execution).]

Figure 5-2. Machine Status Save/Restore Register 0



When an exception occurs, SRR0 is set to point to an instruction such that all prior
instructions have completed execution and no subsequent instruction has begun execution.
The instruction addressed by SRR0 may not have completed execution, depending on the
exception type. SRR0 addresses either the instruction causing the exception or the
immediately following instruction. The instruction addressed can be determined from the 
exception type and status bits. 
SRR1 is a 32-bit register used to save machine status on exceptions and to restore
machine status when rfi or sc is executed. SRR1 is shown in Figure 5-3.



[Figure: bits 0-15 hold exception-specific information; bits 16-31 hold MSR[16-31].]

Figure 5-3. Machine Status Save/Restore Register 1



In general, when an exception occurs, bits 0-15 of SRR1 are loaded with exception-specific
information and bits 16-31 of the machine state register (MSR) are placed into bits 16-31
of SRR1. The machine state register is shown in Figure 5-4.



[Figure: MSR bit layout. Bits 0-15 are reserved. Bits 16-27, in order: EE, PR, FP, ME, FE0, SE, reserved, FE1, reserved, EP, IT, DT. Bits 28-31 are reserved.]

Figure 5-4. Machine State Register

Table 5-4 shows the bit definitions for the MSR. 

Table 5-4. Machine State Register Bit Settings

Bit(s) | Name | Description

0-15 | — | Reserved

16 | EE | External interrupt enable
    0 The processor delays recognition of external interrupts and decrementer exception conditions.
    1 The processor is enabled to take an external interrupt or the decrementer exception.

17 | PR | Privilege level
    0 The processor can execute both user and supervisor privilege-level instructions.
    1 The processor can only execute user-level instructions.

18 | FP | Floating-point available
    0 The processor prevents dispatch of floating-point instructions, including floating-point loads, stores, and moves. Floating-point enabled program exceptions can still occur and the FPRs can still be accessed.
    1 The processor can execute floating-point instructions, and can take floating-point enabled exception type program exceptions.

19 | ME | Machine check enable
    0 Machine check exceptions are disabled.
    1 Machine check exceptions are enabled.

20 | FE0 | Floating-point exception mode 0 (See Table 5-5.)
Table 5-4. Machine State Register Bit Settings (Continued)

Bit(s) | Name | Description

21 | SE | Single-step trace enable
    0 The processor executes instructions normally.
    1 The processor generates a single-step trace exception upon the successful execution of the next instruction. When this bit is set, the processor dispatches instructions in strict program order. Successful execution means the instruction caused no other exception. Single-step tracing may not be present on all implementations. If the function is not implemented, MSR[SE] should be treated as a reserved MSR bit: mfmsr may return the last value written to the bit, or may always return 0.

22 | — | Reserved *

23 | FE1 | Floating-point exception mode 1 (See Table 5-5.)

24 | — | Reserved. This bit corresponds to the AL bit of the POWER Architecture.

25 | EP | Exception prefix. The setting of this bit specifies whether an exception vector offset is prepended with Fs or 0s. In the following description, nnnnn is the offset of the exception. See Table 5-7.
    0 Exceptions are vectored to the physical address x'000n_nnnn'.
    1 Exceptions are vectored to the physical address x'FFFn_nnnn'.

26 | IT | Instruction address translation
    0 Instruction address translation is off. When instruction relocation is off, EA is interpreted as described in Chapter 6, "Memory Management Unit."
    1 Instruction address translation is enabled.

27 | DT | Data address translation
    0 Data address translation is off. When data relocation is off, EA is interpreted as described in Chapter 6, "Memory Management Unit."
    1 Data address translation is enabled.

28-29 | — | Reserved

30 | — | Reserved *

31 | — | Reserved *

* These reserved bits may be used by other PowerPC processors. Attempting to change these bits does
not affect the operation of the processor. These bit positions always return a zero value when read.

The floating-point exception mode bits are interpreted as shown in Table 5-5. For further
details see Section 5.4.7.1, "Floating-Point Enabled Program Exceptions."

Table 5-5. Floating-Point Exception Mode Bits

FE0   FE1   Mode
0     0     Floating-point exceptions disabled
0     1     Floating-point imprecise nonrecoverable
1     0     Floating-point imprecise recoverable
1     1     Floating-point precise mode

MSR bits 16-31 are guaranteed to be written to SRR1 when the first instruction of the
exception handler is encountered.
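
As a minimal sketch of how the FE0 and FE1 encoding in Table 5-5 can be decoded from an MSR image, the following Python fragment extracts bits 20 and 23 using the manual's numbering (bit 0 is the most significant bit of the 32-bit register). The helper names msr_bit and fp_exception_mode are illustrative and are not part of any MPC601 tool set.

```python
# Mode names copied from Table 5-5, keyed by (FE0, FE1).
FP_EXCEPTION_MODES = {
    (0, 0): "Floating-point exceptions disabled",
    (0, 1): "Floating-point imprecise nonrecoverable",
    (1, 0): "Floating-point imprecise recoverable",
    (1, 1): "Floating-point precise mode",
}


def msr_bit(msr, n):
    """Return bit n of a 32-bit MSR image; bit 0 is the most significant bit."""
    return (msr >> (31 - n)) & 1


def fp_exception_mode(msr):
    """Look up the Table 5-5 mode selected by MSR[FE0] (bit 20) and MSR[FE1] (bit 23)."""
    return FP_EXCEPTION_MODES[(msr_bit(msr, 20), msr_bit(msr, 23))]


print(fp_exception_mode(0x00000900))  # FE0 and FE1 both set
```

Note that Section 5.2.1 states that the MPC601 takes all floating-point exceptions whenever either bit is set, so the two imprecise encodings matter mainly for other PowerPC processors.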

The data address register (DAR) is a 32-bit register used by several exceptions (data access, 
I/O controller interface error, and alignment) to identify the address of a memory element. 

5.2.1 Enabling and Disabling Exceptions 

When a condition exists that causes an exception to be generated, it must be determined 
whether the exception is enabled for that condition. 

• Floating-point enabled exceptions (a type of program exception) can be disabled by 
clearing both MSR[FE0] and MSR[FE1]. If either or both of these bits are set, all 
floating-point exceptions are taken and cause a program exception. Other PowerPC 
processors may support imprecise floating-point exceptions. Individual conditions 
that can generate floating-point exceptions can be enabled and disabled with bits in 
the FPSCR register. 

• Asynchronous, precise exceptions are enabled by setting the MSR[EE] bit. When 
MSR[EE]=0, recognition of these exception conditions is delayed. MSR[EE] is 
cleared automatically when an exception is taken to delay recognition of conditions 
causing those exceptions. 

• A machine check exception can only occur if the machine check enable bit, 
MSR[ME], is set. If MSR[ME] is cleared, the processor goes directly into checkstop 
state when a machine-check exception condition occurs. 

• The run mode exception, which is used to set an instruction breakpoint, can be 
enabled and disabled using bits 8 and 9 of HID1 (HID1[RM]). 

• The data address breakpoint can be enabled and disabled using bits 30 and 31 of the 
DABR (HID5[SA]). 

• System reset exceptions cannot be masked. 

5.2.2 Steps for Exception Processing 

After it is determined that the exception can be taken (by confirming that any instruction- 
caused exceptions occurring earlier in the instruction stream have been handled, and by 
confirming that the exception is enabled for the exception condition), the MPC601 does the 
following: 

1. The machine status save/restore register (SRR0) is loaded with an instruction 
address that depends on the type of exception. See the individual exception 
description for details about how this register is used for specific exceptions. 

2. Bits 0-15 of SRR1 are loaded with 16 bits of information specific to the exception 
type. 

3. Bits 16-31 of SRR1 are loaded with a copy of bits 16-31 of the MSR. 

4. The MSR is set as described in Table 5-4. The new values take effect beginning with 
the fetching of the first instruction of the exception-handler routine located at the 
exception vector address. 

Note that MSR[IT] and MSR[DT] are cleared for all exception types; therefore, 
address translation is disabled for both instruction fetches and data access beginning 
with the first instruction of the exception-handler routine. 

5. Instruction fetch and execution resumes, using the new MSR value, at a location 
specific to the exception type. The location is determined by adding the exception's 
vector (see Table 5-7) to the base address determined by MSR[EP]. If EP is cleared, 
exceptions are vectored to the physical address x'000n_nnnn'. If EP is set, 
exceptions are vectored to the physical address x'FFFn_nnnn'. For a machine check 
exception that occurs when MSR[ME]=0 (machine check exceptions are disabled), 
the checkstop state is entered (the machine stops executing instructions). See 
Section 5.4.2, "Machine Check Exception (x'00200')." 

• The lwarx and stwcx. instructions require special handling if a reservation is still set 
when an exception occurs. Exceptions clear reservations set with lwarx (or ldarx). 
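
The state-saving steps above can be sketched in Python as follows. This is an illustrative model only, not a description of the hardware implementation: the function name save_state_on_exception and its parameters are hypothetical, and the set of MSR bits cleared on entry is an assumption drawn from the per-exception register tables later in this chapter (all listed bits cleared except ME and EP, with ME also cleared for a machine check).

```python
def bit_mask(n):
    """Mask for bit n in the manual's numbering (bit 0 is the most significant of 32)."""
    return 1 << (31 - n)


# MSR bit positions from Table 5-4.
EE, PR, FP, ME, FE0, SE, FE1, EP, IT, DT = 16, 17, 18, 19, 20, 21, 23, 25, 26, 27


def save_state_on_exception(msr, srr0_address, exception_info, clear_me=False):
    """Model steps 1-4: load SRR0 and SRR1, then compute the new MSR."""
    srr0 = srr0_address                                          # step 1
    srr1 = ((exception_info & 0xFFFF) << 16) | (msr & 0xFFFF)    # steps 2 and 3
    cleared = [EE, PR, FP, FE0, SE, FE1, IT, DT]                 # step 4 (assumed bit list)
    if clear_me:                                                 # machine check also clears ME
        cleared.append(ME)
    new_msr = msr
    for bit in cleared:
        new_msr &= ~bit_mask(bit)
    return srr0, srr1, new_msr


# Example: EE, ME, EP, IT, and DT set before an external interrupt.
print([hex(v) for v in save_state_on_exception(0x00009070, 0x00001000, 0)])
```

Because MSR[IT] and MSR[DT] are among the cleared bits, the handler begins execution with address translation disabled, as noted in step 4.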

5.2.3 Returning from Supervisor Mode 

The Return from Interrupt (rfi) instruction performs context synchronization by allowing 
previously issued instructions to complete before returning to user mode. Execution of the 
rfi instruction ensures the following: 

• All previous instructions have completed to a point where they can no longer cause 
an exception. If a prior memory access causes an I/O controller interface error 
exception, the results must be determined before this instruction is executed. 

• Previous instructions complete execution in the context (privilege, protection, and 
address translation) under which they were issued. 

• The instructions following this instruction execute in the context established by this 
instruction. 

5.3 Process Switching 

The operating system should execute the following when processes are switched: 

• The sync instruction, to resolve any data dependencies between the processes and to 
synchronize the use of segment registers and SPRs. For an example showing use of 
the sync instruction, see Section 2.3.3.1, "Synchronization for Supervisor-Level 
SPRs, and Segment Registers." 

• The isync instruction, to ensure that undispatched instructions not in the new 
process are not used by the new process. 

• The stwcx. instruction, to clear any outstanding reservations, which ensures that an 
lwarx instruction in the old process is not paired with an stwcx. in the new process. 

Note that if an exception handler is used to emulate an instruction that is not implemented 
in the MPC601, the exception handler must report in SRR0 (and in the data address register 
[DAR] if applicable) the EA computed by the instruction being emulated and not one used 
to emulate the instruction being emulated. 

5.4 Exception Definitions 

Table 5-6 shows all the types of exceptions that can occur with the MPC601 and the MSR 
bit settings when the processor transitions to supervisor mode. The state of these bits prior 
to the exception is typically stored in SRR1. 







Table 5-6. MSR Setting Due to Exception

                                                     MSR Bit
Exception Type                    EE   PR   FP   ME   FE0  SE   FE1  EP   IT   DT   SF
                                  16   17   18   19   20   21   23   25   26   27   31
Soft reset                        0    0    0    —    0    0    0    —    0    0    0
Machine check                     0    0    0    0    0    0    0    —    0    0    0
Data access                       0    0    0    —    0    0    0    —    0    0    0
Instruction access                0    0    0    —    0    0    0    —    0    0    0
External                          0    0    0    —    0    0    0    —    0    0    0
Alignment                         0    0    0    —    0    0    0    —    0    0    0
Program                           0    0    0    —    0    0    0    —    0    0    0
Floating-point unavailable        0    0    0    —    0    0    0    —    0    0    0
Decrementer                       0    0    0    —    0    0    0    —    0    0    0
System call                       0    0    0    —    0    0    0    —    0    0    0
Run mode exception                0    0    0    —    0    0    0    —    0    0    0
I/O controller interface error    0    0    0    —    0    0    0    —    0    0    0

0   Bit is cleared
1   Bit is set
—   Bit is not altered

Reserved bits are read as if written as 0.

The setting of the exception prefix (EP) bit in the MSR determines how exceptions are 
vectored. If the bit is cleared, exceptions are vectored to the physical address x'000n_nnnn' 
(where nnnnn is the vector offset); if EP is set, exceptions are vectored to the physical 
address x'FFFn_nnnn'. Table 5-7 shows the exception vector offset of the first instruction 
of the exception handler routine for each exception type. 

Table 5-7. Exception Vector Offset Table

Vector Offset (hex)   Exception Type
00000                 Reserved
00100                 System reset
00200                 Machine check
00300                 Data access
00400                 Instruction access
00500                 External interrupt
00600                 Alignment
00700                 Program
00800                 Floating-point unavailable
00900                 Decrementer
00A00                 I/O controller interface error
00B00                 Reserved. Note that other PowerPC processors may use this as a
                      vector for the trace exception.
00C00                 System call
00D00                 Reserved. Other PowerPC processors may use this as a vector for
                      the trace exception.
00E00                 Reserved. Other PowerPC processors may use this vector for
                      floating-point assist exceptions.
00E10-00FFF           Reserved
01000-01FFF           Reserved. Other PowerPC processors may use this range for
                      implementation-specific exceptions.
02000                 Run-mode exception (including the trace exception for the MPC601)
02001-03FFF           Reserved
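
The vectoring rule can be illustrated with a short Python sketch that combines the MSR[EP] base address with an offset from Table 5-7. The dictionary keys and the function name handler_address are illustrative choices; only the offsets and the two base addresses come from the text.

```python
# Vector offsets from Table 5-7 (reserved entries omitted).
VECTOR_OFFSETS = {
    "system reset": 0x00100,
    "machine check": 0x00200,
    "data access": 0x00300,
    "instruction access": 0x00400,
    "external interrupt": 0x00500,
    "alignment": 0x00600,
    "program": 0x00700,
    "floating-point unavailable": 0x00800,
    "decrementer": 0x00900,
    "i/o controller interface error": 0x00A00,
    "system call": 0x00C00,
    "run mode": 0x02000,
}

MSR_EP = 1 << (31 - 25)  # MSR bit 25; bit 0 is the most significant bit


def handler_address(msr, exception):
    """Physical address of the first handler instruction for the given exception."""
    base = 0xFFF00000 if msr & MSR_EP else 0x00000000
    return base + VECTOR_OFFSETS[exception]


print(hex(handler_address(MSR_EP, "system reset")))
print(hex(handler_address(0, "external interrupt")))
```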



5.4.1 Reset Exceptions (x'00100') 

The system reset exception is a nonmaskable, asynchronous exception signaled to the 
MPC601 through the assertion of either of the reset signals (SRESET or HRESET). 
The assertion of the soft reset signal, SRESET, as described in Section 8.2.9.4.2, "Soft 
Reset (SRESET) — Input," causes the soft reset exception to be taken and the physical base 
address of the handler is determined by the MSR[EP] bit. The assertion of the hard reset 
signal, HRESET, as described in Section 8.2.9.4.1, "Hard Reset (HRESET) — Input," 
causes the hard reset exception to be taken and the physical address of the handler is always 
x'FFF00100'. 

5.4.1.1 Soft Reset 

Soft reset exceptions are imprecise; they break the instruction pipeline to handle the 
exception. As a result, the MPC601 does not support restarting the interrupted process, 
although it attempts to save the processor state in order to perform diagnostic operations. 
When a soft reset exception occurs, registers are set as shown in Table 5-8. 

Table 5-8. Soft Reset Exception—Register Settings

Register   Setting Description
SRR0       Set to the effective address of the instruction that the processor would have
           attempted to execute next if no exception conditions were present.
SRR1       0-15   Cleared
           16-31  Loaded from bits 16-31 of the MSR. Note that if the processor state is
                  corrupted to the extent that execution cannot be reliably restarted,
                  SRR1[30] is cleared.
MSR        EE   0
           PR   0
           FP   0
           ME   Value is not altered
           FE0  0
           SE   0
           FE1  0
           EP   Value is not altered
           IT   0
           DT   0

When a soft reset exception is taken, instruction execution resumes at offset x'00100' from 
the physical base address indicated by MSR[EP]. 

Before returning to the main program, the exception handler should do the following: 

1. SRR0 and SRR1 should be given the values used by the rfi instruction. 

2. Execute rfi. 

It is not guaranteed that execution is recoverable. Typically, the processor is recoverable in 
a limited sense, if at all. This allows the use of diagnostic aids such as the ESP interface to 
determine system problems. 

5.4.1.2 Hard Reset 

This section describes the MPC601's reset state after performing a hard reset operation 
(asserting HRESET as described in Section 8.2.9.4.1, "Hard Reset (HRESET) — Input"). 
Note that a hard reset operation should be performed on power-on to appropriately reset the 
processor. Table 5-9 shows the state of the machine just before it fetches the first instruction 
after a hard reset. Because of the setting of the MSR[EP] bit caused by a hard reset, the first 
instruction is fetched from address x'FFF00100'. 

Table 5-9. Settings Caused by Hard Reset

Register             Setting
GPRs                 All 0s
FPRs                 All 0s
FPSCR                00000000
Condition register   All 0s
Segment registers    All 0s
MSR                  00001040
MQ                   00000000
XER                  00000000
RTCU                 00000000
RTCL                 00000000 (3)
Link register        00000000
CTR                  00000000
DSISR                00000000
DAR                  00000000
DEC                  00000000
SDR1                 00000000
SRR0                 00000000
SRR1                 00000000
SPRG0                00000000
SPRG1                00000000
SPRG2                00000000
SPRG3                00000000
EAR                  00000000
PVR                  00010001 (1)
BAT registers        All 0s
HID0                 80010080 (2)
HID1                 00000000
HID2                 00000000
HID5                 00000000
HID15                00000000
TLBs                 All 0s
Cache                All 0s
Tag directory        All 0s. (However, the LRU bits are initialized such that each
                     side of the cache has a unique LRU value.)

(1) Early releases (DD1) of the MPC601 hardware set this to x'00010000'. Other versions of
    silicon may be different (see Section 2.3.3.10, "Processor Version Register (PVR)," for
    setting information).
(2) Master checkstop enable on; sequencer GPR self-test checkstop and invalid microcode
    instruction checkstop on.
(3) Note that if an external clock is connected to RTC for the MPC601, then the RTCL, RTCU,
    and DEC can change from their initial value of 0s without receiving instructions to load
    those registers.



The following is also true after a hard reset operation: 

• External checkstops are enabled. 

• The on-chip COP has given control of the PIs/POs to the rest of the chip for 
functional use. 

• Since the reset exception has data and instruction translation disabled (MSR[DT] 
and MSR[IT] both cleared), the chip operates in direct address translation mode. 
This implies that instruction fetches as well as loads and stores are cacheable. 
(Operations that correspond to direct address translations are implicitly cacheable, 
not write-through mode, and require coherency checking on the bus). 

• All internal arrays and registers are cleared during the hard reset process. 
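
As a quick cross-check of the reset value of the MSR listed in Table 5-9, the following Python sketch lists which named MSR bits are set in a given image, using the manual's numbering (bit 0 is the most significant bit). The function name set_msr_fields is illustrative. For x'00001040' it reports ME and EP, which agrees with the text: EP set places the first fetch at x'FFF00100', and IT and DT cleared give direct address translation.

```python
# Named MSR bits from Table 5-4, keyed by bit number.
MSR_BITS = {
    16: "EE", 17: "PR", 18: "FP", 19: "ME", 20: "FE0",
    21: "SE", 23: "FE1", 25: "EP", 26: "IT", 27: "DT",
}


def set_msr_fields(msr):
    """Return the names of the MSR bits that are set, in ascending bit order."""
    return [name for bit, name in sorted(MSR_BITS.items())
            if (msr >> (31 - bit)) & 1]


print(set_msr_fields(0x00001040))
```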

5.4.2 Machine Check Exception (x'00200') 

The MPC601 conditionally initiates a machine-check exception after detecting the 
assertion of the TEA signal on the MPC601 interface. The assertion of the TEA signal 
indicates that a bus error occurred and the system terminates the current transaction. One 
clock cycle after TEA is asserted, the data bus signals go to the high-impedance state; 
however, data entering the GPR or the cache is not invalidated. 

If the MSR[ME] bit is set, the exception is recognized and handled; otherwise, the MPC601 
generates an internal checkstop condition. This may not lead to a true checkstop depending 
upon the state of the various checkstop enable control bits in the HID0 register. These are 
shown in Table 5-10. 

The checkstop sources and enables register (HID0) is a supervisor-level register that 
defines enable and monitor bits for each of the checkstop sources in the MPC601. For 
debugging, HID0[EM] (bit 16) can be cleared to disable the machine-check checkstop 
state. The HID0 register is described in Section 2.3.3.12.1, "Checkstop Sources and 
Enables Register—HID0." 

In general, it is expected that the TEA signal would be used by a memory controller to 
indicate a memory parity error or an uncorrectable memory ECC error. Note that the 
resulting machine check exception is imprecise and has priority over any exceptions caused 
by the instruction that generated the bus operation. 

Machine check exceptions are enabled when MSR[ME]=1; this is described in 
Section 5.4.2.1, "Machine Check Exception Enabled (MSR[ME] = 1)." If MSR[ME]=0 
and a machine check occurs, the processor enters the checkstop state. Checkstop state is 
described in Section 5.4.2.2, "Checkstop State (MSR[ME] = 0)." 

5.4.2.1 Machine Check Exception Enabled (MSR[ME] = 1) 

When a machine check exception is taken, registers are updated as shown in Table 5-10. 

Table 5-10. Machine Check Exception—Register Settings

Register   Setting Description
SRR0       Set to the address of the next instruction that would have been executed in the
           interrupted instruction stream. Neither this instruction nor any others beyond it
           will have been executed. All preceding instructions will have been completed.
SRR1       0-15   Cleared
           16-31  Loaded from MSR[16-31]. Note that if the processor state is corrupted to the
                  extent that execution cannot be reliably restarted, SRR1[30] is cleared.
MSR        EE   0
           PR   0
           FP   0
           ME   0
           FE0  0
           SE   0
           FE1  0
           EP   Value is not altered
           IT   0
           DT   0

           Note that when a machine check exception is taken, the exception handler should
           set MSR[ME] as soon as it is practical to handle another TEA assertion. Otherwise,
           subsequent TEA assertions cause the processor to automatically enter the checkstop
           state.

The machine check exception is almost always unrecoverable in the sense that execution 
cannot resume in the same context that existed before the exception. If the condition that 
caused the machine check does not otherwise prevent continued execution, MSR[ME] is 
set to allow the MPC601 to continue execution at the machine check exception vector 
address, x'00200'. Typically, this does not allow earlier processes to resume; 
however, the operating system can then use the machine check exception handler to try to 
identify and log the cause of the machine check condition. 

When a machine check exception is taken, instruction execution resumes at offset x'00200' 
from the physical base address indicated by MSR[EP]. 

Before returning to the main program, the exception handler should do the following: 

1. SRR0 and SRR1 should be given the values to be used by the rfi instruction. 

2. Execute rfi. 

5.4.2.2 Checkstop State (MSR[ME] = 0) 

When a processor is in the checkstop state, instruction processing is suspended and 
generally cannot be restarted without resetting the processor. The contents of all latches 
(except any associated with the bus clock) are frozen within two cycles upon entering 
checkstop state so that the state of the processor can be analyzed through the use of the ESP 
interface as an aid in problem determination. 

A machine check exception may result from referencing a nonexistent physical address, 
either directly (with MSR[DR]=0), or through an invalid translation. On such a system, for 
example, execution of a Data Cache Block Set to Zero (dcbz) instruction that introduces a 
block into the cache associated with a nonexistent physical address may delay the machine 
check exception until an attempt is made to store that block to main memory. 

Note that not all PowerPC processors provide the same level of error checking. The reasons 
a processor can enter checkstop state are implementation-dependent. 

5.4.3 Data Access Exception (x'00300') 

A data access exception occurs when no higher priority exception exists and a data memory 
access cannot be performed. The condition that caused the data access exception can be 
determined by reading the DAE/source instruction service register (DSISR), a supervisor- 
level SPR (SPR18) that can be read by using the mfspr instruction. Bit settings are 
provided in Table 5-11. Table 5-11 also indicates which memory element is saved to the 
DAR. Data access exceptions can occur for any of the following reasons: 

• The effective address cannot be translated. That is, there is a page fault for this 
portion of the translation, so a data access exception must be taken to retrieve the 
translation from a storage device such as a hard disk drive. 

• The instruction is not supported for the type of memory addressed. Invalid 
instructions are described in Section 3.1.1.1, "Invalid Instruction Forms." I/O 
controller interface segments are described in Section 9.6, "I/O Controller Interface 
Operation." 

• The access violates memory protection. Access is not permitted by the key (Ks and 
Ku) and PP bits, which are set in the segment register and PTE for page protection 
and in the BATs for block protection. 

• The execution of an eciwx or ecowx instruction is disallowed because the external 
access register enable bit (EAR[E]) is cleared. 

These scenarios are common among all PowerPC processors. The following additional 
scenarios can cause a data access exception in the MPC601: 

• An lwarx, stwcx., or lscbx instruction refers to a non-memory-forced I/O controller 
interface segment (that is, when SR[T] = 1 and BUID ≠ x'07F'). 

• An effective address matches the address in the data-address breakpoint register 
(DABR) while in one of the appropriate compare modes. For additional information 
on the DABR and the compare modes, refer to Section 2.3.3.12.4, "Data Address 
Breakpoint Register (DABR)— HID5." 

Data access exceptions can be generated by load/store instructions, and the cache control 
instructions (dcbi, dcbz, dcbst, and dcbf). 

Although the MPC601 does not generally support memory accesses that cross a page 
boundary, load or store multiple as well as load or store string instructions that are word- 
aligned and cross a page boundary are handled. In these cases, if the second page has a 
translation error or protection violation associated with it, the MPC601 takes the data 
access exception in the middle of the instruction. In this case, the data address register 
(DAR) always points to the first byte address of the offending page. 

If an stwcx. instruction has an effective address for which a normal store operation would 
cause a data access exception but the processor does not have the reservation from lwarx, 
the MPC601 determines whether a data access exception occurs as follows: 

• If the reservation bit is cleared before the stwcx. instruction executes, that 
instruction cannot generate an exception, regardless of whether the address 
translation would have failed, page protection would have been violated, or the 
address matches one in the DABR. 

• A data access exception is taken if there is an address translation or page protection 
error, or if the address hits in the DABR as long as the reservation bit is set when the 
stwcx. instruction begins execution. In particular, the exception is taken even if the 
reservation bit is cleared after execution begins. 
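
The two rules above reduce to a simple predicate, sketched here in Python under the assumption that the only inputs are the reservation state when stwcx. begins execution and the three fault conditions named in the text. The function and parameter names are hypothetical.

```python
def stwcx_takes_data_access_exception(reservation_set_at_start,
                                      translation_error,
                                      protection_error,
                                      dabr_match):
    """Return True if stwcx. takes a data access exception under the rules above."""
    if not reservation_set_at_start:
        # Reservation already cleared: no exception, whatever the address would do.
        return False
    # Reservation set when execution begins: any fault condition is reported,
    # even if the reservation is cleared later during execution.
    return translation_error or protection_error or dabr_match


print(stwcx_takes_data_access_exception(False, True, False, False))
print(stwcx_takes_data_access_exception(True, True, False, False))
```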

If the XER indicates that a load or store multiple instruction has a length of zero, a data 
access exception does not occur, regardless of the effective address. The condition that 
caused the exception is defined in the DSISR. These conditions also use the data address 
register (DAR) as shown in Table 5-11. 

Table 5-11 shows the register settings for data access exceptions. 

Table 5-11. Data Access Exception—Register Settings

Register   Setting Description
SRR0       Set to the effective address of the instruction that caused the exception.
SRR1       0-15   Cleared
           16-31  Loaded from bits 16-31 of the MSR
MSR        EE   0
           PR   0
           FP   0
           ME   Value is not altered
           FE0  0
           SE   0
           FE1  0
           EP   Value is not altered
           IT   0
           DT   0
DSISR      0      Reserved on the MPC601. The PowerPC architecture uses this bit for I/O
                  controller interface error exceptions, which are vectored to x'00A00' on
                  the MPC601.
           1      Set if the translation of an attempted access is not found in the primary
                  hash table entry group (HTEG), or in the rehashed secondary HTEG, or in
                  the range of a BAT register; otherwise cleared.
           2-3    Cleared
           4      Set if a memory access is not permitted by the page or BAT protection
                  mechanism; otherwise cleared.
           5      Set if the lwarx, stwcx., or lscbx instruction is attempted to I/O
                  controller interface space.
           6      Set for a store operation and cleared for a load operation.
           7-8    Cleared
           9      Set if an EA matches the address in the DABR while in one of the three
                  compare modes.
           10     Set if the segment table search fails to find a translation for the EA;
                  otherwise cleared.
           11     Set if the instruction was an eciwx or ecowx and EAR[E] = 0.
           12-31  Cleared
DAR        Set to the effective address of a memory element as described in the following list:
           • A byte in the first word accessed in the page that caused the data access
             exception, for a byte, half word, or word memory access.
           • A byte in the first double word accessed in the page that caused the data access
             exception, for a double-word memory access.



When a data access exception is taken, instruction execution resumes at offset x'00300' 
from the physical base address indicated by MSR[EP]. 
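
A data access exception handler typically begins by examining the DSISR. The following Python sketch shows one way to turn a DSISR image into the list of causes given in Table 5-11; the descriptive strings are abbreviations chosen for this example, and bit 0 is taken to be the most significant bit, as elsewhere in this manual.

```python
# DSISR bits that can be set by a data access exception (Table 5-11).
DSISR_DATA_ACCESS_BITS = {
    1: "translation not found in HTEG or BAT",
    4: "protection violation",
    5: "lwarx/stwcx./lscbx to I/O controller interface space",
    6: "store operation",
    9: "DABR match",
    10: "segment table search failed",
    11: "eciwx/ecowx with EAR[E] = 0",
}


def decode_dsisr(dsisr):
    """Return the descriptions of all Table 5-11 bits that are set in dsisr."""
    return [text for bit, text in sorted(DSISR_DATA_ACCESS_BITS.items())
            if (dsisr >> (31 - bit)) & 1]


# Example: a store that violated page protection (bits 4 and 6 set).
print(decode_dsisr(0x0A000000))
```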

The architecture permits certain instructions to be partially executed when they cause a data 
access exception. These are as follows: 

• Load multiple or load string instructions — Some registers in the range of registers 
to be loaded may have been loaded. 

• Store multiple or store string instructions — Some bytes of memory in the range 
addressed may have been updated. 

In the cases above, the questions of how many registers and how much memory is altered 
are instruction- and boundary-dependent. However, memory protection is not violated. 
Furthermore, if some of the data accessed is in I/O controller interface space (SR[T]=1), 
and the instruction is not supported for I/O controller interface accesses, the locations in 
I/O controller interface space are not accessed. 

To preserve the ability to restart, partial execution is not allowed for non-multiple and 
non-string integer load operations and the target register is not altered. For update forms, the 
update register (rA) is not altered. 

5.4.4 Instruction Access Exception (x'00400') 

An instruction access exception occurs when no higher priority exception exists and an 
attempt to fetch the next instruction to be executed cannot be performed for any of the 
following reasons: 

• The effective address cannot be translated. That is, there is a page fault for this 
portion of the translation, so an instruction access exception must be taken to 
retrieve the translation from a storage device such as a hard disk drive. 

• The fetch access is to an I/O controller interface segment that is not memory-forced. 

• The fetch access violates memory protection. Access is not permitted by the K and 
PP bits, which are set in the segment register and PTE for page protection and in the 
BATs for block protection. 

An instruction fetch to an I/O controller interface segment while MSR[IT] is set causes an 
instruction access exception on the MPC601. Register settings for instruction access 
exceptions are shown in Table 5-12. 

Table 5-12. Instruction Access Exception—Register Settings

Register   Setting Description
SRR0       Set to the effective address of the instruction that the processor would have
           attempted to execute next if no exception conditions were present (if the exception
           occurs on attempting to fetch a branch target, SRR0 is set to the branch target
           address).
SRR1       0      Cleared
           1      Set if the translation of an attempted access is not found in the primary
                  hash table entry group (HTEG), or in the rehashed secondary HTEG, or in
                  the range of a BAT register; otherwise cleared.
           2      Cleared
           3      Set if the fetch access was to an I/O controller interface segment
                  (SR[T]=1); otherwise cleared.
           4      Set if a memory access is not permitted by the page or BAT protection
                  mechanism, described in Chapter 6; otherwise cleared.
           5-9    Cleared
           10     Set if the segment table search fails to find a translation for the
                  effective address; otherwise cleared.
           11-15  Cleared
           16-31  Loaded from bits 16-31 of the MSR
MSR        EE   0
           PR   0
           FP   0
           ME   Value is not altered
           FE0  0
           SE   0
           FE1  0
           EP   Value is not altered
           IT   0
           DT   0

When an instruction access exception is taken, instruction execution resumes at offset 
x'00400' from the physical base address indicated by MSR[EP]. 

5.4.5 External Interrupt (x'00500') 

An external interrupt is signaled to the MPC601 by the assertion of the INT signal as 
described in Section 8.2.9.1, "Interrupt (INT) — Input." The interrupt may be delayed by 
other higher priority exceptions or if the MSR[EE] bit is cleared when the exception occurs. 

After the pulse is detected, the MPC601 stops dispatching instructions and waits for 
pending instructions to complete. Therefore, exceptions caused by instructions in progress 
are taken before the external interrupt exception is taken. After all instructions complete, 
the MPC601 takes the external interrupt exception. 

The register settings for the external interrupt exception are shown in Table 5-13. 

Table 5-13. External Interrupt—Register Settings

Register   Setting Description
SRR0       Set to the effective address of the instruction that the processor would have
           attempted to execute next if no interrupt conditions were present.
SRR1       0-15   Cleared
           16-31  Loaded from bits 16-31 of the MSR
MSR        EE   0
           PR   0
           FP   0
           ME   Value is not altered
           FE0  0
           SE   0
           FE1  0
           EP   Value is not altered
           IT   0
           DT   0

When an external interrupt exception is taken, instruction execution resumes at offset 
x'00500' from the physical base address indicated by MSR[EP]. 

5.4.6 Alignment Exception (x'00600') 

This section describes conditions that can cause alignment exceptions in the MPC601. 
Similar to data access exceptions, alignment exceptions use the SRR0 and SRR1 to save 
the machine state and the DSISR to determine the source of the exception. 

The register settings for alignment exceptions are shown in Table 5-14. 

Table 5-14. Alignment Exception — Register Settings 

Register  Setting Description 

SRR0      Set to the effective address of the instruction that caused the exception. 

SRR1      0-15   Cleared 
          16-31  Loaded from bits 16-31 of the MSR 

MSR       EE   0 
          PR   0 
          FP   0 
          ME   Value is not altered 
          FE0  0 
          SE   0 
          FE1  0 
          EP   Value is not altered 
          IT   0 
          DT   0 

DSISR     0-11   Cleared 
          12-13  Cleared. (Note that these bits can be set by several 64-bit PowerPC 
                 instructions that are not supported in the MPC601.) 
          14     Cleared 
          15-16  For instructions that use register indirect with index addressing: set 
                 to bits 29-30 of the instruction. 
                 For instructions that use register indirect with immediate index 
                 addressing: cleared. 
          17     For instructions that use register indirect with index addressing: set 
                 to bit 25 of the instruction. 
                 For instructions that use register indirect with immediate index 
                 addressing: set to bit 5 of the instruction. 
          18-21  For instructions that use register indirect with index addressing: set 
                 to bits 21-24 of the instruction. 
                 For instructions that use register indirect with immediate index 
                 addressing: set to bits 1-4 of the instruction. 
          22-26  Set to bits 6-10 (source or destination) of the instruction. Undefined 
                 for dcbz. 
          27-31  Set to bits 11-15 of the instruction (rA). 
                 Set to either bits 11-15 of the instruction or to any register number not 
                 in the range of registers loaded by a valid form instruction, for lmw, 
                 lswi, and lswx instructions. Otherwise undefined. 

Note that for load or store instructions that use register indirect with index 
addressing, the DSISR can be set to the same value that would have 
resulted if the corresponding instruction that uses register indirect with immediate 
index addressing had caused the exception. Similarly, for load or store 
instructions that use register indirect with immediate index addressing, the 
DSISR can hold a value that would have resulted from an instruction that 
uses register indirect with index addressing. For example, an unaligned lwax 
instruction that crosses a protection boundary would normally cause the 
DSISR to be set to the following binary value: 

000000000000 00 0 01 0 0101 ttttt ????? 

The value ttttt refers to the destination and ????? indicates undefined bits. 
However, this register may be set as if the instruction were lwa, as follows: 

000000000000 10 0 00 0 1101 ttttt ????? 

If there is no corresponding instruction, no alternative value can be specified. 
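
The DSISR field rules in Table 5-14 can be sketched in code. The following Python fragment is only an illustration, not a description of MPC601 logic. The helper names ibits and alignment_dsisr are hypothetical, the fragment assumes that primary opcode 31 identifies the register indirect with index forms, and it models only the normal encoding (not the alternative value described in the note above, nor the special cases for dcbz, lmw, lswi, and lswx). 

```python
def ibits(word, first, last):
    """Extract bits first..last of a 32-bit word, PowerPC numbering (bit 0 = MSB)."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def alignment_dsisr(instr):
    """Build DSISR[15-31] from an instruction word per Table 5-14 (sketch)."""
    # Assumption: primary opcode 31 marks register indirect with index forms.
    indexed = ibits(instr, 0, 5) == 31
    if indexed:
        f15_16 = ibits(instr, 29, 30)
        f17 = ibits(instr, 25, 25)
        f18_21 = ibits(instr, 21, 24)
    else:
        f15_16 = 0
        f17 = ibits(instr, 5, 5)
        f18_21 = ibits(instr, 1, 4)
    f22_26 = ibits(instr, 6, 10)   # source or destination register
    f27_31 = ibits(instr, 11, 15)  # rA
    # DSISR bit 31 is the least-significant bit; bits 0-14 stay cleared.
    return (f15_16 << 15) | (f17 << 14) | (f18_21 << 10) | (f22_26 << 5) | f27_31
```

For example, the word 0x94210000 (stwu r1,0(r1)) yields 0x4821, whose bits 15-21 are 00 1 0010. 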



5.4.6.1 Integer Alignment Exceptions 

The MPC601 is optimized for load and store operations that are aligned on natural 
boundaries. Operations that are not naturally aligned may suffer performance degradation, 
depending on the type of operation, the boundaries crossed, and the mode that the processor 
is in during execution. More specifically, these operations may either cause an alignment 
exception or they may cause the processor to break the memory access into multiple, 
smaller accesses with respect to the cache and the memory subsystem. 

The MPC601 can initiate an alignment exception for the access types shown in Table 5-15. In 
all of these cases, the appropriate range check is performed before the instruction begins 
execution. As a result, if an alignment exception is taken, it is guaranteed that no portion of 
the instruction has been executed. 

Table 5-15. Access Types 

MSR[DT]  SR[T]  SR[BUID]    Access Type 

0        0      x           Direct translation access 
x        1      Not x'07F'  I/O controller interface access 
x        1      x'07F'      Memory-forced I/O controller interface access 
1        0      x           Page-address translation access 
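
The selection in Table 5-15 can be expressed as a small decision function. This is a hedged sketch: the function name access_type and the returned strings are illustrative only, and the BTLB-match condition that the text attaches to page-address translation is not modeled. 

```python
MEMORY_FORCED_BUID = 0x07F  # SR[BUID] value that selects memory-forced accesses


def access_type(msr_dt, sr_t, sr_buid):
    """Classify an access from MSR[DT], SR[T], and SR[BUID] per Table 5-15."""
    if sr_t:
        # MSR[DT] is a don't care when SR[T] is set.
        if sr_buid == MEMORY_FORCED_BUID:
            return "memory-forced I/O controller interface"
        return "I/O controller interface"
    # SR[BUID] is a don't care when SR[T] is cleared.
    if msr_dt:
        return "page-address translation"
    return "direct translation"
```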



5.4.6.1.1 Direct-Translation Access 

A direct-translation access occurs when both MSR[DT] and SR[T] are cleared. If a 
256-Mbyte boundary is crossed by any portion of the memory being accessed by an 
instruction (including string/multiples), an alignment exception is taken. 

5.4.6.1.2 I/O Controller Interface Access 

An I/O controller interface access occurs when a data access is initiated, SR[T] is set, and 
SR[BUID] is not equal to x'07F'. In the MPC601 (but not for the general PowerPC 
processor case), MSR[DT] is a don't care for this case. The following apply for I/O 
controller interface accesses: 

• If a 256-Mbyte boundary will be crossed by any portion of the I/O controller 
interface space accessed by an instruction (the entire string for strings/multiples), an 
alignment exception is taken. 

• Floating-point loads and stores to I/O controller interface segments always cause an 
alignment exception, regardless of operand alignment. 

• The lwarx/stwcx./lscbx instructions that map into an I/O controller interface 
segment always cause a data access exception (not an alignment exception), 
regardless of operand alignment. 

Note that other I/O controller interface errors may generate an I/O controller interface error 
exception, as described in Section 5.4.10, "I/O Controller Interface Error Exception 
(x'00A00')." 

5.4.6.1.3 Memory-Forced I/O Controller Interface Access 

A memory-forced I/O controller interface access occurs when SR[T] is set, and SR[BUID] 
is x'07F' in the MPC601 (not defined as part of the PowerPC architecture). MSR[DT] is a 
don't care for this case. 

If a 256-Mbyte boundary is crossed by any portion of the memory being accessed by an 
instruction (including string/multiples), an alignment exception is taken. 

Note that floating-point instructions and lwarx, stwcx., and lscbx instructions are handled 
as the page- and block-address translation cases for the memory-forced I/O controller 
interface segments. Memory-forced I/O controller interface operations do not cause special 
cases of alignment or data access exceptions. 

5.4.6.1.4 Page Address Translation Access 

A page-address translation access occurs when MSR[DT] is set, SR[T] is cleared and there 
is not a BTLB match. Note the following points: 

• The following is true for all loads and stores except strings/multiples: 

— An alignment exception is taken if the operand spans a 4-Kbyte boundary. 

— Byte operands never cause an alignment exception. 

— Half-word operands cause an alignment exception if the EA ends in x'FFF'. 

— Word operands cause an alignment exception if the EA ends in x'FFD-FFF'. 

— Double-word operands cause an alignment exception if the EA ends in 
x'FF9-FFF'. 

• The lscbx instruction causes an alignment exception if any portion of the entire 
string crosses into the next 4-Kbyte page of memory. This is taken regardless of the 
starting address, even if the lscbx operand starts on a word boundary. 

• All other string/multiple instructions (except lscbx) take alignment exceptions as 
follows: 

— If the string/multiple starts on a word boundary and a 256-Mbyte boundary is 
crossed by any portion of the entire string/multiple, an alignment exception is 
taken. Note that it must be a 256-Mbyte crossing — a simple 4-Kbyte crossing 
does not cause an exception for a word-aligned string/multiple operation. 

— If any portion of the string/multiple will cross into the next 4-Kbyte page of 
memory, an alignment exception is taken. 

• The dcbz instruction causes an alignment exception if the access is to a page or block 
with the W (write-through) or I (cache-inhibit) bit set in the UTLB or BTLB, 
respectively. 

Note that the above summary indicates that a 256-Mbyte crossing always causes an 
alignment exception. This includes accesses of all four types regardless of alignment. Of 
course, non-string/multiple load and store operations can only cross this boundary if they 
are not aligned. 
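
The boundary conditions listed above for ordinary (non-string/multiple) operands reduce to one test: whether the first and last bytes of the operand fall in different 4-Kbyte pages. The sketch below illustrates that arithmetic under page address translation. The names crosses and scalar_takes_alignment_exception are hypothetical, and the string/multiple, lscbx, and dcbz rules are not modeled. 

```python
PAGE_SIZE = 1 << 12      # 4 Kbytes
SEGMENT_SIZE = 1 << 28   # 256 Mbytes


def crosses(ea, length, boundary):
    """True if bytes ea .. ea+length-1 lie on both sides of a multiple of boundary."""
    return (ea // boundary) != ((ea + length - 1) // boundary)


def scalar_takes_alignment_exception(ea, length):
    """Page address translation, non-string/multiple operand of 1, 2, 4, or 8 bytes."""
    return crosses(ea, length, PAGE_SIZE)
```

With this test, a half word faults only at an EA ending in x'FFF', a word at x'FFD-FFF', and a double word at x'FF9-FFF'; a byte never faults. The same helper with SEGMENT_SIZE expresses the 256-Mbyte check used by the other access types. 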

Misaligned memory accesses that do not cause an alignment exception may not perform as 
well as an aligned access of the same type. In general, the IU is designed to efficiently 
handle memory access quantities of eight bytes or fewer that lie within a double-word 
boundary. Internally, all integer memory access instructions that involve more than four 
bytes of data are broken into multiple accesses of four bytes or fewer. Floating-point memory 
access instructions always involve either four or eight bytes of data. Any memory access 
that crosses a double-word boundary is further broken into two smaller accesses that do not 
cross the double-word boundary. For multiple-word and string operations, the MPC601 
does not force alignment to reduce the number of accesses. 
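
As a rough illustration of the splitting described above for integer accesses, the following sketch divides an access into pieces of at most four bytes that never cross a double-word boundary. The function name split_access is hypothetical, and the exact piece boundaries chosen by the hardware are an assumption; the text states only the constraints on the pieces. 

```python
def split_access(ea, length):
    """Return (address, size) pieces of at most 4 bytes, each inside one double word."""
    pieces = []
    end = ea + length
    while ea < end:
        to_dword_end = 8 - (ea % 8)           # bytes left before the next 8-byte boundary
        size = min(4, end - ea, to_dword_end)
        pieces.append((ea, size))
        ea += size
    return pieces
```

Under these assumptions, a word at an EA ending in 6 becomes two 2-byte accesses, while an aligned word remains a single access. 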

The resulting performance degradation due to misaligned accesses depends on how well 
each individual access behaves with respect to the memory hierarchy. At a minimum, 
additional cache access cycles are required that can delay other processor resources from 
using the cache. More dramatically, for an access to a noncacheable page, each discrete 
access involves individual MPC601 bus operations that reduce the effective bandwidth of 
that bus. 

Finally, note that when the MPC601 is in page address translation mode, there is no special 
handling for accesses that fall into BAT regions. If one of the 4-Kbyte crossing conditions 
indicated above happens to be completely contained within a BAT register, the MPC601 
still takes the alignment exception. 

5.4.6.2 Floating-Point Alignment Exceptions 

An alignment exception occurs when no higher priority exception exists and the MPC601 
cannot perform a memory access for one of the following reasons: 

• The operand of a floating-point load or store operation is in a non-memory-forced 
I/O controller interface segment (SR[T]=1). 

• The operand of a load or store crosses a 4-Kbyte boundary when MSR[DT] is set, or 
crosses a 256-Mbyte boundary when MSR[DT] is cleared. 

5.4.6.3 Little-Endian Mode Alignment Exceptions 

In little-endian mode, any operand that is not properly aligned (as described in 
Section 2.4.6, "PowerPC Data Memory with LM Set") causes an alignment exception. 
Additionally, any attempted execution of the string/multiple instructions causes an 
alignment exception. 

5.4.6.4 Interpretation of the DSISR as Set by an Alignment Exception 

For most alignment exceptions, an exception handler may be designed to emulate the 
instruction that causes the exception. To do this, it needs the following characteristics of the 
instruction: 

• Load or store 

• Length (half word, word, or double word) 

• String, multiple, or normal load/store 

• Integer or floating-point 

• Whether the instruction performs update 

• Whether the instruction performs byte reversal 

• Whether it is a dcbz instruction 

The PowerPC architecture provides this information implicitly, by setting opcode bits in the 
DSISR that identify the excepting instruction type. The exception handler does not need to 
load the excepting instruction from memory. The mapping for all exception possibilities is 
unique except for the few exceptions discussed below. 

Table 5-16 shows the inverse mapping — how the DSISR bits identify the instruction that 
caused the exception. 

The alignment exception handler cannot distinguish whether a floating-point load or store 
caused the exception because it is misaligned or because it addresses the I/O controller 
interface space. However, this does not matter; in either case the instruction is emulated 
with integer instructions. 

Table 5-16. DSISR[15-21] Settings to Determine Misaligned Instruction 

DSISR[15-21]  Instruction 

00 0 0000     lwarx, lwz, proprietary (1) 
00 0 0010     stw 
00 0 0100     lhz 
00 0 0101     lha 
00 0 0110     sth 
00 0 0111     lmw 
00 0 1000     lfs 
00 0 1001     lfd 
00 0 1010     stfs 
00 0 1011     stfd 
00 1 0000     lwzu 
00 1 0010     stwu 
00 1 0100     lhzu 
00 1 0101     lhau 
00 1 0110     sthu 
00 1 0111     stmw 
00 1 1000     lfsu 
00 1 1001     lfdu 
00 1 1010     stfsu 
00 1 1011     stfdu 
01 0 0101     lwax 
01 0 1000     lswx 
01 0 1001     lswi 
01 0 1010     stswx 
01 0 1011     stswi 
01 1 0101     lwaux 
10 0 0010     stwcx. 
10 0 1000     lwbrx 
10 0 1010     stwbrx 
10 0 1100     lhbrx 
10 0 1110     sthbrx 
10 1 1111     dcbz 
11 0 0000     lwzx 
11 0 0010     stwx 
11 0 0100     lhzx 
11 0 0101     lhax 
11 0 0110     sthx 
11 0 1000     lfsx 
11 0 1001     lfdx 
11 0 1010     stfsx 
11 0 1011     stfdx 
11 1 0000     lwzux 
11 1 0010     stwux 
11 1 0100     lhzux 
11 1 0101     lhaux 
11 1 0110     sthux 
11 1 1000     lfsux 
11 1 1001     lfdux 
11 1 1010     stfsux 
11 1 1011     stfdux 

(1) The instructions lwz and lwarx give the same DSISR bits (all zero). But if lwarx causes an 
alignment exception, it is an invalid form, so it need not be emulated in any precise way. It is 
adequate for the alignment exception handler to simply emulate the instruction as if it were an 
lwz. It is important that the emulator use the address in the DAR, rather than computing it from 
rA/rB/D, because lwz and lwarx use different addressing modes. 
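
An exception handler could apply the inverse mapping of Table 5-16 with a simple lookup. The sketch below is illustrative only: the dictionary is transcribed from the table, while the names DSISR_15_21 and decode_alignment_dsisr are hypothetical. It returns the mnemonic together with the register fields held in DSISR[22-26] and DSISR[27-31]. 

```python
# DSISR[15-21] -> mnemonic, transcribed from Table 5-16.
DSISR_15_21 = {
    0b00_0_0000: "lwz",  # lwarx reports the same value (see footnote 1)
    0b00_0_0010: "stw", 0b00_0_0100: "lhz", 0b00_0_0101: "lha",
    0b00_0_0110: "sth", 0b00_0_0111: "lmw", 0b00_0_1000: "lfs",
    0b00_0_1001: "lfd", 0b00_0_1010: "stfs", 0b00_0_1011: "stfd",
    0b00_1_0000: "lwzu", 0b00_1_0010: "stwu", 0b00_1_0100: "lhzu",
    0b00_1_0101: "lhau", 0b00_1_0110: "sthu", 0b00_1_0111: "stmw",
    0b00_1_1000: "lfsu", 0b00_1_1001: "lfdu", 0b00_1_1010: "stfsu",
    0b00_1_1011: "stfdu", 0b01_0_0101: "lwax", 0b01_0_1000: "lswx",
    0b01_0_1001: "lswi", 0b01_0_1010: "stswx", 0b01_0_1011: "stswi",
    0b01_1_0101: "lwaux", 0b10_0_0010: "stwcx.", 0b10_0_1000: "lwbrx",
    0b10_0_1010: "stwbrx", 0b10_0_1100: "lhbrx", 0b10_0_1110: "sthbrx",
    0b10_1_1111: "dcbz", 0b11_0_0000: "lwzx", 0b11_0_0010: "stwx",
    0b11_0_0100: "lhzx", 0b11_0_0101: "lhax", 0b11_0_0110: "sthx",
    0b11_0_1000: "lfsx", 0b11_0_1001: "lfdx", 0b11_0_1010: "stfsx",
    0b11_0_1011: "stfdx", 0b11_1_0000: "lwzux", 0b11_1_0010: "stwux",
    0b11_1_0100: "lhzux", 0b11_1_0101: "lhaux", 0b11_1_0110: "sthux",
    0b11_1_1000: "lfsux", 0b11_1_1001: "lfdux", 0b11_1_1010: "stfsux",
    0b11_1_1011: "stfdux",
}


def decode_alignment_dsisr(dsisr):
    """Return (mnemonic or None, DSISR[22-26], DSISR[27-31])."""
    key = (dsisr >> 10) & 0x7F   # DSISR[15-21]; bit 31 is the LSB
    reg = (dsisr >> 5) & 0x1F    # source or destination register
    ra = dsisr & 0x1F            # rA
    return DSISR_15_21.get(key), reg, ra
```

A value with no table entry returns None for the mnemonic, which a handler would treat as an instruction it cannot emulate. 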

5.4.7 Program Exception (x'00700') 

A program exception occurs when no higher priority exception exists and one or more of 
the following exception conditions, which correspond to bit settings in SRR1, occur during 
execution of an instruction: 

• System floating-point enabled exception — A system floating-point enabled 
exception is generated when the following condition is met: 

(MSR[FE0] | MSR[FE1]) & FPSCR[FEX] is 1. 

FPSCR[FEX] is set by the execution of a floating-point instruction that causes an 
enabled exception or by the execution of a "move to FPSCR" type instruction that 
sets an exception bit when its corresponding enable bit is set. In the MPC601, all 
floating-point enabled exceptions are handled in a precise manner. As a result, all 
program exceptions taken on behalf of a floating-point enabled exception clear 
SRR1[15] to indicate that the address in SRR0 points to the instruction that caused 
the exception. For more information, refer to Section 5.4.7.1, "Floating-Point 
Enabled Program Exceptions." 

• Illegal instruction — An illegal instruction program exception is generated when 
execution of an instruction is attempted with an illegal opcode or illegal 
combination of opcode and extended opcode fields, or when execution of an 
optional instruction not provided in the MPC601 is attempted (these do not include 
those optional instructions, such as Instruction Cache Block Invalidate, icbi, that are 
treated as no-ops). 

• Privileged instruction — A privileged instruction type program exception is 
generated when the execution of a privileged instruction is attempted and the MSR 
register privileged bit, MSR[PR], is set. Some implementations may generate this 
exception for mtspr or mfspr with an invalid SPR field if SPR[0]=1 and MSR[PR]=1. 

• Trap — A trap type program exception is generated when any of the conditions 
specified in a trap instruction is met. Trap instructions are described in Chapter 3, 
"Addressing Modes and Instruction Set Summary." 

• Illegal operations — The MPC601 takes illegal operation program exceptions for 
unimplemented PowerPC instructions. 

Note that instructions using an invalid instruction form do not take a program exception, 
but instead cause results that are boundedly undefined. The register settings are shown in 
Table 5-17. 

Table 5-17. Program Exception — Register Settings 

Register  Setting Description 

SRR0      Contains the effective address of the excepting instruction 

SRR1      0-10   Cleared 
          11     Set for a floating-point enabled program exception; otherwise cleared. 
          12     Set for an illegal instruction program exception; otherwise cleared. 
          13     Set for a privileged instruction program exception; otherwise cleared. 
          14     Set for a trap program exception; otherwise cleared. 
          15     Cleared if SRR0 contains the address of the instruction causing the 
                 exception, and set if SRR0 contains the address of a subsequent instruction. 
          16-31  Loaded from bits 16-31 of the MSR. 
          Note that only one of bits 11-14 can be set. 

MSR       EE   0 
          PR   0 
          FP   0 
          ME   Value is not altered 
          FE0  0 
          SE   0 
          FE1  0 
          EP   Value is not altered 
          IT   0 
          DT   0 

When a program exception is taken, instruction execution resumes at offset x'00700' from 
the physical base address indicated by MSR[EP]. 
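
A handler entered at x'00700' can read the cause of the program exception from SRR1 bits 11-15 as listed in Table 5-17. The fragment below is a minimal sketch of that decoding; the names srr1_bit and program_exception_cause are hypothetical, and bits are numbered with bit 0 as the most-significant bit. 

```python
PROGRAM_CAUSES = {
    11: "floating-point enabled",
    12: "illegal instruction",
    13: "privileged instruction",
    14: "trap",
}


def srr1_bit(srr1, n):
    """Return bit n of a 32-bit SRR1 image (bit 0 = MSB)."""
    return (srr1 >> (31 - n)) & 1


def program_exception_cause(srr1):
    """Return (list of causes, True if SRR0 addresses the excepting instruction)."""
    causes = [name for bit, name in PROGRAM_CAUSES.items() if srr1_bit(srr1, bit)]
    srr0_is_excepting = srr1_bit(srr1, 15) == 0
    return causes, srr0_is_excepting
```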

5.4.7.1 Floating-Point Enabled Program Exceptions 

In the MPC601, floating-point exceptions are signaled by condition bits set in the 
floating-point status and control register (FPSCR). They can cause the system floating-point enabled 
exception error handler to be invoked. All floating-point exceptions are handled precisely. 
The FPSCR is shown in Figure 5-5. 



[Figure: FPSCR bit layout. Bits 0-12: FX, FEX, VX, OX, UX, ZX, XX, VXSNAN, VXISI, VXIDI, 
VXZDZ, VXIMZ, VXVC; bit 13: FR; bit 14: FI; bits 15-19: FPRF; bit 20: reserved; 
bits 21-23: VXSOFT, VXSQRT, VXCVI; bits 24-28: VE, OE, UE, ZE, XE; bit 29: reserved; 
bits 30-31: RN.] 

Figure 5-5. Floating-Point Status and Control Register 

A listing of FPSCR bit settings is shown in Table 5-18. 

Table 5-18. FPSCR Bit Settings 

Bit(s)  Description 

0       Floating-point exception summary (FX). Every floating-point instruction implicitly sets 
        FPSCR[FX] if that instruction causes any of the floating-point exception bits in the 
        FPSCR to transition from 0 to 1. The mcrfs instruction implicitly clears FPSCR[FX] if 
        the FPSCR field containing FPSCR[FX] is copied. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 
        instructions can set or clear FPSCR[FX] explicitly. This is a sticky bit. 

1       Floating-point enabled exception summary (FEX). This bit signals the occurrence of any 
        of the enabled exception conditions. It is the logical OR of all the floating-point 
        exception bits masked with their respective enables. The mcrfs instruction implicitly 
        clears FPSCR[FEX] if the result of the logical OR described above becomes zero. The 
        mtfsf, mtfsfi, mtfsb0, and mtfsb1 instructions cannot set or clear FPSCR[FEX] 
        explicitly. This is not a sticky bit. 

2       Floating-point invalid operation exception summary (VX). This bit signals the occurrence 
        of any invalid operation exception. It is the logical OR of all of the invalid operation 
        exceptions. The mcrfs instruction implicitly clears FPSCR[VX] if the result of the 
        logical OR described above becomes zero. The mtfsf, mtfsfi, mtfsb0, and mtfsb1 
        instructions cannot set or clear FPSCR[VX] explicitly. This is not a sticky bit. 

3       Floating-point overflow exception (OX). This is a sticky bit. 

4       Floating-point underflow exception (UX). This is a sticky bit. 

5       Floating-point zero divide exception (ZX). This is a sticky bit. 

6       Floating-point inexact exception (XX). This is a sticky bit. 

7       Floating-point invalid operation exception for SNaN (VXSNAN). This is a sticky bit. 

8       Floating-point invalid operation exception for ∞ - ∞ (VXISI). This is a sticky bit. 

9       Floating-point invalid operation exception for ∞/∞ (VXIDI). This is a sticky bit. 

10      Floating-point invalid operation exception for 0/0 (VXZDZ). This is a sticky bit. 

11      Floating-point invalid operation exception for ∞*0 (VXIMZ). This is a sticky bit. 

12      Floating-point invalid operation exception for invalid compare (VXVC). This is a 
        sticky bit. 

13      Floating-point fraction rounded (FR). The last floating-point instruction that 
        potentially rounded the intermediate result incremented the fraction. 

14      Floating-point fraction inexact (FI). The last floating-point instruction that 
        potentially rounded the intermediate result produced an inexact fraction or a disabled 
        exponent overflow. 

15-19   Floating-point result flags (FPRF). This field is based on the value placed into the 
        target register even if that value is undefined. Refer to Table 2-2 for specific bit 
        settings. 

        15     Floating-point result class descriptor (C). Floating-point instructions other 
               than the compare instructions may set this bit with the FPCC bits, to indicate 
               the class of the result. 

        16-19  Floating-point condition code (FPCC). Floating-point compare instructions 
               always set one of the FPCC bits to one and the other three FPCC bits to zero. 
               Other floating-point instructions may set the FPCC bits with the C bit, to 
               indicate the class of the result. Note that in this case the high-order three 
               bits of the FPCC retain their relational significance indicating that the value 
               is less than, greater than, or equal to zero. 

               16  Floating-point less than or negative (FL or <) 
               17  Floating-point greater than or positive (FG or >) 
               18  Floating-point equal or zero (FE or =) 
               19  Floating-point unordered or NaN (FU or ?) 

20      Reserved 

21      Floating-point invalid operation exception for software request (VXSOFT). This bit can 
        be altered only by the mcrfs, mtfsfi, mtfsf, mtfsb0, or mtfsb1 instructions. The purpose 
        of VXSOFT is to allow software to cause an invalid operation condition for a condition 
        that is not necessarily associated with the execution of a floating-point instruction. 
        For example, it might be set by a program that computes a square root if the source 
        operand is negative. This is a sticky bit. 

22      Floating-point invalid operation exception for invalid square root (VXSQRT). This is a 
        sticky bit. This bit allows software to simulate fsqrt and frsqrte, and provides a 
        consistent interface to handle exceptions caused by square-root operations. 

23      Floating-point invalid operation exception for invalid integer convert (VXCVI). This is 
        a sticky bit. See Section 5.4.7.2, "Invalid Operation Exception Conditions." 

24      Floating-point invalid operation exception enable (VE) 

25      Floating-point overflow exception enable (OE) 

26      Floating-point underflow exception enable (UE). This bit should not be used to determine 
        whether denormalization should be performed on floating-point stores. 

27      Floating-point zero divide exception enable (ZE) 

28      Floating-point inexact exception enable (XE) 

29      Reserved. This bit may be implemented as the non-IEEE mode bit (NI) in other PowerPC 
        implementations. 

30-31   Floating-point rounding control (RN). 

        00  Round to nearest 
        01  Round toward zero 
        10  Round toward +infinity 
        11  Round toward -infinity 
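
To make the bit numbering in Table 5-18 concrete, the following sketch reads individual flags and the rounding control from a 32-bit FPSCR image. The helper names (fpscr_bit, set_flags, rounding_mode) are hypothetical and only the single-bit fields are listed; bit 0 is the most-significant bit. 

```python
FPSCR_BITS = {
    "FX": 0, "FEX": 1, "VX": 2, "OX": 3, "UX": 4, "ZX": 5, "XX": 6,
    "VXSNAN": 7, "VXISI": 8, "VXIDI": 9, "VXZDZ": 10, "VXIMZ": 11, "VXVC": 12,
    "FR": 13, "FI": 14, "VXSOFT": 21, "VXSQRT": 22, "VXCVI": 23,
    "VE": 24, "OE": 25, "UE": 26, "ZE": 27, "XE": 28,
}

ROUNDING_MODES = {
    0b00: "round to nearest",
    0b01: "round toward zero",
    0b10: "round toward +infinity",
    0b11: "round toward -infinity",
}


def fpscr_bit(fpscr, n):
    """Return bit n of a 32-bit FPSCR image (bit 0 = MSB)."""
    return (fpscr >> (31 - n)) & 1


def set_flags(fpscr):
    """Names of the single-bit fields that are set, in bit order."""
    return [name for name, n in FPSCR_BITS.items() if fpscr_bit(fpscr, n)]


def rounding_mode(fpscr):
    """Decode RN (bits 30-31)."""
    return ROUNDING_MODES[fpscr & 0b11]
```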



The following conditions that can cause program exceptions are detected by the processor. 
These conditions may occur during execution of floating-point arithmetic instructions. The 
corresponding bits set in the FPSCR are indicated in parentheses. 

• Invalid floating-point operation exception condition (VX) 

— SNaN condition (VXSNAN) 

— Infinity-infinity condition (VXISI) 

— Infinity/infinity condition (VXIDI) 

— Zero/zero condition (VXZDZ) 

— Infinity*zero condition (VXIMZ) 

— Illegal compare condition (VXVC) 

These exception conditions are described in Section 5.4.7.2, "Invalid Operation 
Exception Conditions." 

• Software request condition (VXSOFT). These exception conditions are described in 
Section 5.4.7.2, "Invalid Operation Exception Conditions." 

• Illegal integer convert condition (VXCVI). These exception conditions are 
described in Section 5.4.7.2, "Invalid Operation Exception Conditions." 

• Zero divide exception condition (ZX). These exception conditions are described in 
Section 5.4.7.3, "Zero Divide Exception Condition." 

• Overflow Exception Condition (OX). These exception conditions are described in 
Section 5.4.7.4, "Overflow Exception Condition." 

• Underflow Exception Condition (UX). These exception conditions are described in 
Section 5.4.7.5, "Underflow Exception Condition." 

• Inexact Exception Condition (XX). These exception conditions are described in 
Section 5.4.7.6, "Inexact Exception Condition." 

Each floating-point exception condition and each category of illegal floating-point 
operation exception condition, has a corresponding exception bit in the FPSCR. In addition, 
each floating-point exception has a corresponding enable bit in the FPSCR. The exception 
bit indicates the occurrence of the corresponding condition. If a floating-point exception 
occurs, the corresponding enable bit governs the result produced by the instruction and, in 
conjunction with bits FE0 and FE1, whether and how the system floating-point enabled 
exception error handler is invoked. (The "enabling" specified by the enable bit is of 
invoking the system error handler, not of permitting the exception condition to occur. The 
occurrence of an exception condition depends only on the instruction and its inputs, not on 
the setting of any control bits.) 

The floating-point exception summary bit (FX) in the FPSCR is set when any of the 
exception condition bits transitions from a zero to a one or when explicitly set by software. 
The floating-point enabled exception summary bit (FEX) in the FPSCR is set when any of 
the exception condition bits is set and the exception is enabled (enable bit is one). 
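
The relationship between the exception bits, their enables, and the VX and FEX summary bits can be written out directly. The sketch below recomputes both summaries from an FPSCR image using the bit positions in Table 5-18; the function and constant names are hypothetical, and the sticky FX bit is not modeled because it depends on bit transitions rather than on the current value. 

```python
VX_CAUSE_BITS = (7, 8, 9, 10, 11, 12, 21, 22, 23)        # VXSNAN .. VXCVI
ENABLE_FOR = {"VX": 24, "OX": 25, "UX": 26, "ZX": 27, "XX": 28}
EXCEPTION_BIT = {"OX": 3, "UX": 4, "ZX": 5, "XX": 6}


def fpscr_bit(fpscr, n):
    """Return bit n of a 32-bit FPSCR image (bit 0 = MSB)."""
    return (fpscr >> (31 - n)) & 1


def summary_bits(fpscr):
    """Return (VX, FEX) recomputed from the individual exception and enable bits."""
    vx = any(fpscr_bit(fpscr, n) for n in VX_CAUSE_BITS)
    flags = {"VX": vx}
    for name, n in EXCEPTION_BIT.items():
        flags[name] = bool(fpscr_bit(fpscr, n))
    # FEX: OR of every exception bit masked with its enable bit.
    fex = any(flags[name] and fpscr_bit(fpscr, en) for name, en in ENABLE_FOR.items())
    return vx, fex
```

For example, ZX with ZE set gives FEX = 1, while ZX with ZE cleared leaves FEX = 0. 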

A single instruction may set more than one exception condition bit in the following cases: 

• The inexact exception condition bit may be set with overflow exception condition. 

• The inexact exception condition bit may be set with underflow exception condition. 

• The illegal floating-point operation exception condition bit (SNaN) may be set with 
illegal floating-point operation exception condition (∞*0) for multiply-add 
instructions. 

• The illegal operation exception condition bit (SNaN) may be set with illegal 
floating-point operation exception condition (illegal compare) for compare ordered 
instructions. 

• The illegal floating-point operation exception condition bit (SNaN) may be set with 
illegal floating-point operation exception condition (illegal integer convert) for 
convert to integer instructions. 

When an exception occurs, the instruction execution may be suppressed or a result may be 
delivered, depending on the exception condition. 

Instruction execution is suppressed for the following kinds of exception conditions, so that 
there is no possibility that one of the operands is lost: 

• Enabled illegal floating-point operation 

• Enabled zero divide 

For the remaining kinds of exception conditions, a result is generated and written to the 
destination specified by the instruction causing the exception. The result may be a different 
value for the enabled and disabled conditions for some of these exception conditions. The 
kinds of exception conditions that deliver a result are the following: 

• Disabled illegal floating-point operation 

• Disabled zero divide 

• Disabled overflow 

• Disabled underflow 

• Disabled inexact 

• Enabled overflow 

• Enabled underflow 

• Enabled inexact 

Subsequent sections define each of the floating-point exception conditions and specify the 
action taken when they are detected. 

The IEEE standard specifies the handling of exception conditions in terms of traps and trap 
handlers. In the PowerPC architecture, setting an FPSCR exception enable bit causes 
generation of the result value specified in the IEEE standard for the trap enabled case — the 
expectation is that the exception is detected by software, which will revise the result. An 
FPSCR exception enable bit of 0 causes generation of the default result value specified for 
the trap disabled (or no trap occurs or trap is not implemented) case — the expectation is that 
the exception will not be detected by software, which will simply use the default result. The 
result to be delivered in each case for each exception is described in the following sections. 

The IEEE default behavior when an exception occurs, which is to generate a default value 
and not to notify software, is obtained by clearing all FPSCR exception enable bits and 
using ignore exceptions mode (see Table 5-19). In this case the system floating-point 
enabled exception error handler is not invoked, even if floating-point exceptions occur. If 
necessary, software can inspect the FPSCR exception bits to determine whether exceptions 
have occurred. 

If the program exception handler is to notify software that a given exception condition has 
occurred, the corresponding FPSCR exception enable bit must be set and a mode other than 
ignore exceptions mode must be used. In this case the system floating-point enabled 
exception error handler is invoked if an enabled floating-point exception condition occurs. 

Whether and how the system floating-point enabled exception error handler is invoked if 
an enabled floating-point exception occurs is controlled by MSR bits FE0 and FE1 as 
shown in Table 5-19. (The system floating-point enabled exception error handler is never 
invoked because of a disabled floating-point exception.) 

Table 5-19. MSR[FE0] and MSR[FE1] Bit Settings 

FE0  FE1  Description 

0    0    Ignore exceptions mode — Floating-point exceptions do not cause the program 
          exception error handler to be invoked. 

0    1    Imprecise nonrecoverable mode — This mode is not applicable to the MPC601. FE0 and 
          FE1 are ORed, so setting either bit results in running the processor in precise mode. 
          Note that in PowerPC processors that support this mode, the system floating-point 
          enabled exception error handler is invoked at some point at or beyond the instruction 
          that caused the enabled exception. The state of the processor may include conditions 
          and data affected by the exception (that is, hazards are not avoided). It may not be 
          possible to identify the excepting instruction or the data that caused the exception 
          (that is, the data is not recoverable). 

1    0    Imprecise recoverable mode — This mode is not applicable to the MPC601. FE0 and FE1 
          are ORed, so setting either bit results in running the processor in precise mode. 
          Note that in PowerPC processors that support this mode, the system floating-point 
          enabled exception error handler is invoked at some point at or beyond the instruction 
          that caused the enabled exception. Sufficient information is provided to the system 
          floating-point enabled exception error handler that it can identify the excepting 
          instruction and the operands, and correct the result. All hazards caused by the 
          exception are avoided (for example, use of the data that would have been produced by 
          the excepting instruction). 

1    1    Precise mode — The system floating-point enabled exception error handler is invoked 
          precisely at the instruction that caused the enabled exception. 



Note that in the MPC601, FE0 and FE1 are ORed; therefore, unless both FE0 and FE1 are 
cleared, the MPC601 operates in precise mode. Whether a floating-point result is stored and 
what value is stored are determined by the FPSCR exception enable bits, as described in 
subsequent sections, and are not affected by any MSR bit settings. 
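
Because the MPC601 ORs FE0 and FE1, the four rows of Table 5-19 collapse to two behaviors on this processor. The sketch below states that collapse, together with the handler-invocation condition (MSR[FE0] | MSR[FE1]) & FPSCR[FEX] from Section 5.4.7; the function names are hypothetical and the inputs are single-bit values (0 or 1). 

```python
def fp_exception_mode_601(fe0, fe1):
    """Effective floating-point exception mode on the MPC601 for MSR[FE0], MSR[FE1]."""
    return "precise" if (fe0 | fe1) else "ignore exceptions"


def invokes_handler(fe0, fe1, fex):
    """True if the system floating-point enabled exception handler is invoked."""
    return bool((fe0 | fe1) & fex)
```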

Whenever the system floating-point enabled exception error handler is invoked, the 
microprocessor ensures that all instructions logically residing before the excepting 
instruction have completed, and no instruction after that instruction has been executed. 

If exceptions are ignored, an FPSCR instruction can be used to force any exceptions, due 
to instructions initiated before the FPSCR instruction, to be recorded in the FPSCR. A sync 
instruction can also be used to force exceptions, but is likely to degrade performance more 
than an FPSCR instruction. 






For the best performance across the widest range of implementations, the following 
guidelines should be considered: 

• If the IEEE default results are acceptable to the application, FE0 and FE1 should be 
cleared (ignore exceptions mode). All FPSCR exception enable bits should be 
cleared. 

• Ignore exceptions mode should not, in general, be used when any FPSCR exception 
enable bits are set. 

• Precise mode may degrade performance in some implementations, perhaps 
substantially, and therefore should be used only for debugging and other specialized 
applications. 

5.4.7.2 Invalid Operation Exception Conditions 

An invalid operation exception occurs when an operand is invalid for the specified 
operation. The invalid operations are as follows: 

• Any operation except load, store, move, select, or mtfsf on a signaling NaN (SNaN) 

• For add or subtract operations, magnitude subtraction of infinities (∞ - ∞) 

• Division of infinity by infinity (∞/∞) 

• Division of zero by zero (0/0) 

• Multiplication of infinity by zero (∞*0) 

• Ordered comparison involving a NaN (invalid compare) 

• Square root or reciprocal square root of a negative, non-zero number (invalid square 
root) 

• Integer convert involving a number that is too large to be represented in the format, 
an infinity, or a NaN (invalid integer convert) 

FPSCR[VXSOFT] allows software to cause an invalid operation exception for a condition 
that is not necessarily associated with the execution of a floating-point instruction. For 
example, it might be set by a program that computes a square root if the source operand is 
negative. This facilitates the emulation of PowerPC instructions not implemented in the 
MPC601. 

5.4.7.2.1 Action for Invalid Operation Exception Conditions

The action to be taken depends on the setting of the invalid operation exception enable bit
of the FPSCR. When the invalid operation exception is enabled (FPSCR[VE]=1) and an invalid
operation occurs or software explicitly requests the exception, the following actions are
taken:

• One or two invalid operation exception condition bits are set
FPSCR[VXSNAN] (if SNaN)
FPSCR[VXISI] (if ∞ - ∞)
FPSCR[VXIDI] (if ∞/∞)
FPSCR[VXZDZ] (if 0/0)
FPSCR[VXIMZ] (if ∞ × 0)
FPSCR[VXVC] (if invalid comparison)
FPSCR[VXSOFT] (if software request)
FPSCR[VXCVI] (if invalid integer convert)

• If the operation is an arithmetic or convert-to-integer operation,
the target FPR is unchanged
FPSCR[FR FI] are cleared
FPSCR[FPRF] is unchanged

• If the operation is a compare,
FPSCR[FR FI C] are unchanged
FPSCR[FPCC] is set to reflect unordered

• If software explicitly requests the exception,
FPSCR[FR FI FPRF] are as set by the mtfsfi, mtfsf, or mtfsb1 instruction

When the invalid operation exception condition is disabled (FPSCR[VE]=0) and an invalid
operation occurs or software explicitly requests the exception, the following actions are
taken:

• One or two invalid operation exception condition bits are set
FPSCR[VXSNAN] (if SNaN)
FPSCR[VXISI] (if ∞ - ∞)
FPSCR[VXIDI] (if ∞/∞)
FPSCR[VXZDZ] (if 0/0)
FPSCR[VXIMZ] (if ∞ × 0)
FPSCR[VXVC] (if invalid comparison)
FPSCR[VXSOFT] (if software request)
FPSCR[VXCVI] (if invalid integer convert)

• If the operation is an arithmetic operation, the target FPR is set to a quiet NaN
FPSCR[FR FI] are cleared
FPSCR[FPRF] is set to indicate the class of the result (quiet NaN)

• If the operation is a convert to 32-bit integer operation, the target FPR is set as
follows:
FRT[0-31] = undefined
FRT[32-63] = most negative 32-bit integer
FPSCR[FR FI] are cleared
FPSCR[FPRF] is undefined

• If the operation is a convert to 64-bit integer operation, the target FPR is set as
follows:
FRT[0-63] = most negative 64-bit integer
FPSCR[FR FI] are cleared
FPSCR[FPRF] is undefined

• If the operation is a compare,
FPSCR[FR FI C] are unchanged
FPSCR[FPCC] is set to reflect unordered

• If software explicitly requests the exception,
FPSCR[FR FI FPRF] are as set by the mtfsfi, mtfsf, or mtfsb1 instruction

5.4.7.3 Zero Divide Exception Condition 

A zero divide exception condition occurs when a divide instruction is executed with a zero 
divisor value and a finite, non-zero dividend value. 

The name is a misnomer used for historical reasons. The proper name for this exception 
condition should be exact infinite result from finite operands exception condition 
corresponding to a mathematical pole. 

5.4.7.3.1 Action for Zero Divide Exception Condition 

The action to be taken depends on the setting of the zero divide exception condition enable
bit of the FPSCR. When the zero divide exception condition is enabled (FPSCR[ZE]=1)
and a zero divide condition occurs, the following actions are taken:

• Zero divide exception condition bit is set 
FPSCR[ZX] = 1 

• The target FPR is unchanged 

• FPSCR[FR FI] are cleared 

• FPSCR[FPRF] is unchanged 

When the zero divide exception condition is disabled (FPSCR[ZE]=0) and a zero divide
occurs, the following actions are taken:

• Zero divide exception condition bit is set 
FPSCR[ZX] = 1 

• The target FPR is set to a ±infinity, where the sign is determined by the 
XOR of the signs of the operands 

• FPSCR[FR FI] are cleared 

• FPSCR[FPRF] is set to indicate the class and sign of the result (±infinity)
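
The disabled-case result can be modeled with a short Python sketch. The function name is hypothetical and the model covers only the value written to the target FPR, not the FPSCR updates; it assumes IEEE signed zeros, so the sign of a zero divisor takes part in the XOR.

```python
import math


def zero_divide_result(dividend, divisor):
    """Value delivered to the target FPR when FPSCR[ZE]=0 and a zero divide occurs."""
    # A zero divide needs a zero divisor and a finite, non-zero dividend.
    if divisor != 0 or dividend == 0 or not math.isfinite(dividend):
        raise ValueError("not a zero divide condition")
    # The sign of the result is the XOR of the operand signs (copysign also sees -0.0).
    dividend_negative = math.copysign(1.0, dividend) < 0
    divisor_negative = math.copysign(1.0, divisor) < 0
    return -math.inf if dividend_negative != divisor_negative else math.inf
```

For example, dividing 1.0 by -0.0 yields -infinity, while dividing -2.0 by -0.0 yields +infinity.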

5.4.7.4 Overflow Exception Condition

Overflow occurs when the magnitude of what would have been the rounded result if the 
exponent range were unbounded exceeds that of the largest finite number of the specified 
result precision. 

5.4.7.4.1 Action for Overflow Exception Condition 

The action to be taken depends on the setting of the overflow exception condition enable
bit of the FPSCR. When the overflow exception condition is enabled (FPSCR[OE]=1) and
an exponent overflow condition occurs, the following actions are taken:

• Overflow exception condition bit is set 
FPSCR[OX] = 1 

• For double-precision arithmetic instructions, the exponent of the normalized 
intermediate result is adjusted by subtracting 1536 

• For single-precision arithmetic instructions and the floating round to
single-precision instruction, the exponent of the normalized intermediate result is
adjusted by subtracting 192

• The adjusted rounded result is placed into the target FPR 

• FPSCR[FPRF] is set to indicate the class and sign of the result (±normal number) 
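
As a rough illustration of the exponent adjustment above, the following Python sketch applies the stated adjustment to an unbounded exponent and checks that the result lands in the normalized range. The exponent ranges are the IEEE 754 single- and double-precision values (an assumption brought in from that standard), and all names are hypothetical.

```python
# Amount subtracted from the exponent when an enabled overflow occurs (from the text above).
OVERFLOW_ADJUST = {"double": 1536, "single": 192}

# Unbiased exponent range of normalized numbers (IEEE 754 values, assumed here).
NORMAL_RANGE = {"double": (-1022, 1023), "single": (-126, 127)}


def overflow_adjusted_exponent(exponent, precision):
    """Return the exponent of the result delivered when FPSCR[OE]=1."""
    return exponent - OVERFLOW_ADJUST[precision]


def is_normal_exponent(exponent, precision):
    """Check whether an unbiased exponent is representable as a normalized number."""
    low, high = NORMAL_RANGE[precision]
    return low <= exponent <= high
```

An exponent of 1024, one past the double-precision maximum, becomes -512 after adjustment, which is why FPSCR[FPRF] reports a normal number.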

When the overflow exception condition is disabled (FPSCR[OE]=0) and an overflow 
condition occurs, the following actions are taken: 

• Overflow exception condition bit is set
FPSCR[OX] = 1

• Inexact exception condition bit is set 
FPSCR[XX] = 1 

• The result is determined by the rounding mode (FPSCR[RN]) and the sign of the 
intermediate result as follows: 

— Round to nearest 

Store ± infinity, where the sign is the sign of the intermediate result 

— Round toward zero 

Store the format's largest finite number with the sign of the intermediate result 

— Round toward +infinity 

For negative overflows, store the format's most negative finite number; for 
positive overflows, store +infinity 

— Round toward -infinity 

For negative overflows, store -infinity; for positive overflows, store the format's 
largest finite number 

• The result is placed into the target FPR

• FPSCR[FR FI] are cleared 

• FPSCR[FPRF] is set to indicate the class and sign of the result (±infinity 
or ±normal number) 
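
The rounding-mode rules for a disabled overflow can be summarized in a small Python sketch. The mode labels and the function name are hypothetical, and the default largest finite number is that of the host's double-precision format; a single-precision caller would pass its own maximum.

```python
import math
import sys


def overflow_result(negative, rounding_mode, largest_finite=sys.float_info.max):
    """Value stored when FPSCR[OE]=0 and the intermediate result overflows.

    negative      -- True if the intermediate result is negative
    rounding_mode -- "nearest", "zero", "+inf", or "-inf" (hypothetical labels for FPSCR[RN])
    """
    if rounding_mode == "nearest":
        to_infinity = True
    elif rounding_mode == "zero":
        to_infinity = False
    elif rounding_mode == "+inf":
        to_infinity = not negative  # only positive overflows reach +infinity
    elif rounding_mode == "-inf":
        to_infinity = negative      # only negative overflows reach -infinity
    else:
        raise ValueError("unknown rounding mode")
    magnitude = math.inf if to_infinity else largest_finite
    return -magnitude if negative else magnitude
```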

5.4.7.5 Underflow Exception Condition 

The underflow exception condition is defined separately for the enabled and disabled states: 

• Enabled — Underflow occurs when the intermediate result is "Tiny." 

• Disabled — Underflow occurs when the intermediate result is "Tiny" and there is 
"Loss of Accuracy." 

A "Tiny" result is detected before rounding, when a non-zero result value computed as
though the exponent range were unbounded would be less in magnitude than the smallest
normalized number.

If the intermediate result is "Tiny" and the underflow exception condition enable bit is
cleared (FPSCR[UE]=0), the intermediate result is denormalized (see Section 2.4.9.4,
"Normalization and Denormalization") and rounded (see Section 2.4.9.6, "Rounding").

"Loss of Accuracy" is detected when the delivered result value differs from what would 
have been computed were both the exponent range and precision unbounded. 
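
The two definitions can be sketched in Python using exact rational arithmetic for the unbounded intermediate result. This is a hedged model, not the hardware algorithm: it assumes double precision, it uses the host's conversion to float (which denormalizes and rounds to nearest) as the delivered result, and all names are hypothetical.

```python
from fractions import Fraction

# Smallest normalized double-precision magnitude, 2**-1022 (IEEE 754 value, assumed).
MIN_NORMAL = Fraction(1, 2**1022)


def is_tiny(exact):
    """Tiny: non-zero and smaller in magnitude than the smallest normalized number."""
    return exact != 0 and abs(exact) < MIN_NORMAL


def loses_accuracy(exact):
    """Loss of accuracy: the delivered (denormalized, rounded) value differs from the exact one."""
    return Fraction(float(exact)) != exact


def underflow_occurs(exact, ue_enabled):
    """Apply the enabled or disabled definition of the underflow exception condition."""
    if ue_enabled:
        return is_tiny(exact)
    return is_tiny(exact) and loses_accuracy(exact)
```

For instance, the exact value 2**-1030 is tiny but exactly representable as a denormalized number, so it signals underflow only when FPSCR[UE]=1.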

5.4.7.5.1 Action for Underflow Exception Condition 

The action to be taken depends on the setting of the underflow exception condition enable 
bit of the FPSCR. 

When the underflow exception condition is enabled (FPSCR[UE]=1) and an exponent 
underflow condition occurs, the following actions are taken: 

• Underflow exception condition bit is set
FPSCR[UX] = 1

• For double-precision arithmetic and conversion instructions, the exponent of
the normalized intermediate result is adjusted by adding 1536

• For single-precision arithmetic instructions and the floating round to
single-precision instruction, the exponent of the normalized intermediate result is
adjusted by adding 192

• The adjusted rounded result is placed into the target FPR

• FPSCR[FPRF] is set to indicate the class and sign of the result (±normalized
number)

The FR and FI bits in the FPSCR allow the system floating-point enabled exception error
handler, when invoked because of an underflow exception condition, to simulate a trap
disabled environment. That is, the FR and FI bits allow the system floating-point enabled
exception error handler to unround the result, thus allowing the result to be denormalized.

When the underflow exception condition is disabled (FPSCR[UE]=0) and an underflow
condition occurs, the following actions are taken:

• Underflow exception condition bit is set
FPSCR[UX] = 1

• The rounded result is placed into the target FPR 

• FPSCR[FPRF] is set to indicate the class and sign of the result 
(±denormalized number or ±zero) 

5.4.7.6 Inexact Exception Condition 

The inexact exception condition occurs when one of two conditions occurs during rounding:

• The rounded result differs from the intermediate result assuming the intermediate 
result exponent range and precision to be unbounded. 

• The rounded result overflows and overflow exception condition is disabled. 

5.4.7.6.1 Action for Inexact Exception Condition

The action to be taken does not depend on the setting of the inexact exception condition 
enable bit of the FPSCR. 

When the inexact exception condition occurs, the following actions are taken: 

• Inexact exception condition bit in the FPSCR is set
FPSCR[XX] = 1

• The rounded or overflowed result is placed into the target FPR 

• FPSCR[FPRF] is set to indicate the class and sign of the result 

In other PowerPC implementations, enabling inexact exception conditions may have 
greater latency than enabling other types of floating-point exception condition. 

5.4.8 Floating-Point Unavailable Exception (x'00800') 

A floating-point unavailable exception occurs when no higher priority exception exists, an
attempt is made to execute a floating-point instruction (including floating-point load, store,
and move instructions), and the floating-point available bit in the MSR is cleared
(MSR[FP]=0).

The register settings for floating-point unavailable exceptions are shown in Table 5-20. 

Table 5-20. Floating-Point Unavailable Exception—Register Settings

Register    Setting Description

SRR0        Set to the effective address of the instruction that caused the exception.

SRR1        0-15    Cleared
            16-31   Loaded from bits 16-31 of the MSR

MSR         EE   0                       SE   0
            PR   0                       FE1  0
            FP   0                       EP   Value is not altered
            ME   Value is not altered    IT   0
            FE0  0                       DT   0

When a floating-point unavailable exception is taken, instruction execution resumes at
offset x'00800' from the physical base address indicated by MSR[EP].

5.4.9 Decrementer Exception (x'00900') 

A decrementer exception occurs when no higher priority exception exists, the decrementer 
register has completed decrementing, and MSR[EE]=1. The decrementer exception request
is canceled when the exception is handled. The decrementer register counts down, causing 
an exception (unless masked) when passing through zero. The decrementer implementation 
meets the following requirements: 

• The operation of the RTC and the decrementer are coherent; that is, the counters are 
driven by the same fundamental time base (7.8125 MHz). 

• Loading a GPR from the decrementer does not affect the decrementer. 

• Storing a GPR value to the decrementer replaces the value in the decrementer with 
the value in the GPR. 

• Whenever bit 0 of the decrementer changes from 0 to 1, an exception request is
signaled. If multiple decrementer exception requests are received before the first can
be reported, only one exception is reported. The occurrence of a decrementer
exception cancels the request.

• If the decrementer is altered by software and if bit 0 is changed from 0 to 1, an
interrupt request is signaled.
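
The requirements above can be modeled with a small Python class. This is a behavioral sketch only: the class and method names are hypothetical, the 7.8125 MHz time base is not modeled, and bit 0 is taken to be the most significant bit of the 32-bit register, following PowerPC bit numbering.

```python
class Decrementer:
    """Behavioral model of the 32-bit decrementer and its exception request."""

    MASK = 0xFFFFFFFF

    def __init__(self, value=0):
        self.value = value & self.MASK
        self.request_pending = False

    def _update(self, new_value):
        new_value &= self.MASK
        # Bit 0 is the most significant bit; a 0 -> 1 change signals a request.
        if (self.value >> 31) == 0 and (new_value >> 31) == 1:
            self.request_pending = True  # repeated requests collapse into one
        self.value = new_value

    def tick(self):
        """Count down by one; passing through zero sets bit 0."""
        self._update(self.value - 1)

    def write(self, gpr_value):
        """Store a GPR value to the decrementer (software alteration)."""
        self._update(gpr_value)

    def read(self):
        """Load a GPR from the decrementer; the decrementer is unaffected."""
        return self.value

    def take_exception(self, msr_ee):
        """Report the exception if requested and unmasked; taking it cancels the request."""
        if self.request_pending and msr_ee:
            self.request_pending = False
            return True
        return False
```

Starting from a value of 1, the first tick reaches zero without a request and the second tick wraps to x'FFFFFFFF', which sets bit 0 and raises the request.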

The register settings for the decrementer exception are shown in Table 5-21.

Table 5-21. Decrementer Exception—Register Settings

Register    Setting Description

SRR0        Set to the effective address of the instruction that the processor would have
            attempted to execute next if no exception conditions were present.

SRR1        0-15    Cleared
            16-31   Loaded from bits 16-31 of the MSR

MSR         EE   0                       SE   0
            PR   0                       FE1  0
            FP   0                       EP   Value is not altered
            ME   Value is not altered    IT   0
            FE0  0                       DT   0


When a decrementer exception is taken, instruction execution resumes at offset x'00900'
from the physical base address indicated by MSR[EP].

5.4.10 I/O Controller Interface Error Exception (x'00A00')

An I/O controller interface error exception occurs when no higher priority exception exists
and a load or store corresponding to an I/O controller interface segment generates an error.
I/O controller interface operations are described in Section 9.6, "I/O Controller Interface
Operation."

This exception is taken only when an operation to an I/O controller interface segment fails
(such a failure is indicated to the MPC601 by a particular bus reply packet). If an I/O
controller interface error exception occurs, SRR0 contains the address of the instruction
following the excepting instruction. Note that illegal accesses to I/O controller interface
space cause alignment or data access exceptions. For information refer to Section 5.4.6.1.2,
"I/O Controller Interface Access." Note that this exception is specific to the MPC601. The
PowerPC architecture treats I/O controller interface exceptions as data access exceptions
(x'00300').

The register settings for I/O controller interface error exceptions are shown in Table 5-22. 

Table 5-22. I/O Controller Interface Error Exception—Register Settings

Register    Setting Description

SRR0        Set to the effective address of the instruction following the load or store that
            caused the exception. The addressed instruction has not been executed.

SRR1        0-15    Cleared
            16-31   Loaded from bits 16-31 of the MSR

MSR         EE   0                       SE   0
            PR   0                       FE1  0
            FP   0                       EP   Value is not altered
            ME   Value is not altered    IT   0
            FE0  0                       DT   0

DAR         Set to the EA generated for the access that caused the exception

When an I/O controller interface error exception is taken, instruction execution resumes at
offset x'00A00' from the physical base address indicated by MSR[EP].

5.4.11 System Call Exception (x'00C00')

A system call exception occurs when a System Call (sc) instruction is executed. The
effective address of the instruction following the sc instruction is placed into SRR0. Bits
16-31 of the MSR are placed into bits 16-31 of SRR1, and bits 0-15 of SRR1 are set to
undefined values. Then a system call exception is generated.

The system call exception causes the next instruction to be fetched from offset x'00C00'
from the physical base address indicated by the new setting of MSR[EP]. This instruction is
context synchronizing. That is, when a system call exception occurs, instruction dispatch is 
halted and the following synchronization is performed: 

1. The exception mechanism waits for all instructions in execution to complete to a 
point where they report all exceptions they will cause. 

2. The processor ensures that all instructions in execution complete in the context in 
which they began execution. 

3. Instructions dispatched after the exception is processed are fetched and executed in 
the context established by the exception mechanism. 

Register settings are shown in Table 5-23. 

Table 5-23. System Call Exception—Register Settings

Register    Setting Description

SRR0        Set to the effective address of the instruction following the System Call instruction

SRR1        0-15    Loaded from bits 16-31 of the instruction
            16-31   Loaded from bits 16-31 of the MSR

MSR         EE   0                       SE   0
            PR   0                       FE1  0
            FP   0                       EP   Value is not altered
            ME   Value is not altered    IT   0
            FE0  0                       DT   0


When a system call exception is taken, instruction execution resumes at offset x'00C00'
from the physical base address indicated by MSR[EP].

5.4.12 Run Mode Exception (x'02000') 

The MPC601 defines an implementation-specific exception called the run mode exception. 
This exception is taken by the MPC601 under the following circumstances: 

• Instruction address compare 

• Branch target address compare 

• Trace mode (MSR[SE] is set). Note that other PowerPC processors implement a
separate trace exception at vector x'00D00'.

Note that this exception may not be implemented by other PowerPC processors, and that
this exception can be enabled and disabled using bits 8 and 9 in HID1; the exception is
enabled when HID1[8,9] = b'01'. When this exception occurs, the registers are set as
indicated in Table 5-24.

Table 5-24. Run Mode Exception—Register Settings

Register    Setting

SRR0        Set to the address of the instruction that causes the run mode exception

SRR1        Loaded from bits 0-31 of the MSR.

MSR         EE   0                       SE   0
            PR   0                       FE1  0
            FP   0                       EP   Value is not altered
            ME   Value is not altered    IT   0
            FE0  0                       DT   0


When a run mode exception is taken, instruction execution resumes at offset x'02000' from
the base address indicated by MSR[EP].
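
The vector offsets given in Sections 5.4.8 through 5.4.12 can be collected into a lookup, sketched here in Python. The offsets come from the section headings; the two base addresses selected by MSR[EP] (x'00000000' when EP is 0 and x'FFF00000' when EP is 1) are an assumption based on the general PowerPC exception prefix convention and are not stated in this section. All names are hypothetical.

```python
# Vector offsets from the section headings of Sections 5.4.8 through 5.4.12.
VECTOR_OFFSETS = {
    "floating-point unavailable": 0x00800,
    "decrementer": 0x00900,
    "I/O controller interface error": 0x00A00,
    "system call": 0x00C00,
    "run mode": 0x02000,
}


def vector_address(exception, msr_ep):
    """Return the physical address at which execution resumes for an exception."""
    # Assumed base addresses: EP=0 selects the low prefix, EP=1 the high prefix.
    base = 0xFFF00000 if msr_ep else 0x00000000
    return base + VECTOR_OFFSETS[exception]
```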

Chapter 6

Memory Management Unit 

This chapter describes the MPC601's memory management unit (MMU). The primary
functions of the MMU are to translate logical (effective) addresses to physical addresses 
for memory accesses, I/O accesses (most I/O accesses are assumed to be memory-mapped), 
and I/O controller interface accesses, and to provide access protection on a block or page 
basis. 

There are three types of accesses generated by the MPC601 that require address translation: 
instruction accesses, data accesses to memory generated by load and store instructions, and 
I/O controller interface accesses generated by load and store instructions. 

The MPC601 MMU provides 4-Gbytes of logical address space accessible to supervisor 
and user programs with a 4-Kbyte page size and 256-Mbyte segment size. Block sizes 
range from 128 Kbyte to 8 Mbyte and are software selectable. In addition, the MPC601 
uses an interim 52-bit virtual address and hashed page tables in the generation of 32-bit 
physical addresses. 

The MMU contains three translation lookaside buffers (TLBs). There is a 256-entry, two- 
way set-associative unified (instruction and data address) TLB (UTLB) for storing 
recently-used address translations, and a four-entry fully-associative first-level instruction 
TLB (ITLB) that is used only by instruction accesses for storing recently used instruction 
address translations. Additionally, there is a four-entry block TLB (BTLB) that stores the 
available block address translations (for instruction or data addresses). BTLB entries are 
implemented as the block address translation (BAT) registers that are accessible as 
supervisor special-purpose registers (SPRs). UTLB entries are generated automatically by 
the MPC601 hardware via a search of the page tables in memory. The MPC601 maintains 
all the segment information on-chip in 16 supervisor-level segment registers. 

This chapter describes the MMU address translation mechanisms, the MMU conditions
that cause MPC601 exceptions, the instructions used to program the MMU, and the
corresponding registers.

The MPC601 MMU relies on the exception processing mechanism for the implementation
of the paged virtual memory environment and for enforcing protection of designated
memory areas. Exception processing is described in Chapter 5, "Exceptions."

Section 2.3.1, "Machine State Register (MSR)," describes the MSR of the MPC601, which
controls some of the critical functionality of the MMU.

The operation of the MPC601 MMU conforms to the operating environment defined by the
PowerPC architecture for 32-bit implementations in most respects. However, the number
and format of the BAT registers is different, as is the available range of block sizes. In
addition, the PowerPC architecture defines the concept of guarded memory that is not
implemented in the MPC601. Also, some MMU instructions of the PowerPC architecture
(including tlbsync) are not implemented in the MPC601.

Note that the memory-forced I/O controller interface functionality described for the 
MPC601 is not defined as part of the PowerPC architecture, and will not be present in other 
PowerPC processors. Also note that the hardware implementation details of the MPC601 
MMU are not contained in the architectural definition of PowerPC processors and are 
invisible to the programming model. 

6.1 MMU Overview 

The MPC601 MMU and exception model support demand paged virtual memory. Virtual 
memory management permits execution of programs larger than the size of physical 
memory; demand paged implies that individual pages are loaded into physical memory 
from backing storage only when they are first accessed by an executing program. 

The memory management model of the MPC601 includes the concept of a virtual address
space that is larger than both the maximum physical memory allowed and the logical
address space. Each logical address generated by the MPC601 is 32 bits wide. In the
address translation process, a logical address is converted to a 52-bit virtual address (as
governed by the operating system) and then translated to a 32-bit physical address.

The operating system is responsible for managing the system's physical memory resources. 
Consequently, the operating system programs the MMU registers (segment registers, BAT 
registers, and table search descriptor register 1 (SDR1)) and sets up the page tables in
memory appropriately. The MMU then assists the operating system by managing page 
status and maintaining the recently-used address translations on-chip for quick access. 

The MPC601 logical address spaces are divided into 256-Mbyte regions called segments
or other large regions called blocks (128 Kbyte-8 Mbyte). Segments that correspond to
memory or memory-mapped devices can be further subdivided into smaller regions called
pages (4 Kbyte). For each block or page, the operating system creates an address descriptor
(page table entry (PTE) or BTLB entry) that the MMU uses to generate the physical address
and the protection and other access control information when an address within the block
or page is accessed. Address descriptors for pages reside in tables (as PTEs) in the physical
memory; for faster accesses, the MMU maintains on-chip copies of recently used PTEs in
the ITLB and UTLB, and keeps the block information on-chip in the BTLB (comprised of
the BAT registers).

This section provides an overview of the high-level organization and operational concepts
of the MPC601 MMU, and a summary of all MMU control registers. Section 2.3.3.5,
"Table Search Descriptor Register 1 (SDR1)," describes the SDR1 register, Section 2.3.2,
"Segment Registers," describes the segment registers, Section 2.3.1, "Machine State
Register (MSR)," describes the MSR, and Section 2.3.3.11, "BAT Registers," describes the
BAT registers.

6.1.1 Memory Addressing 

A program references memory using the effective address computed by the processor when 
it executes a load, store, branch, or cache instruction, and when it fetches the next 
instruction. The effective (logical) address is translated to a physical address according to 
the procedures described throughout this chapter. The memory subsystem uses the physical 
address for the access. 

For a complete discussion of effective address calculation, see Section 3.1.1, "Effective 
Address Calculation." 

6.1.2 MMU Organization 

Figure 6-1 shows the conceptual organization of the MMU and its relationship to some of
the other functional units in the MPC601. The instruction unit generates all instruction 
addresses; these addresses are both for sequential instruction prefetches and addresses that 
correspond to a change of program flow. The integer unit generates addresses for data 
accesses (both for memory and the I/O controller interface). 

After an address is generated, the upper order bits of the logical address, LA0-LA19 (or a
smaller set of address bits, LA0-LAn, in the cases of blocks), are translated by the MMU
into physical address bits PA0-PA19. Simultaneously, the lower order address bits,
A20-A31 (that are untranslated and therefore considered both logical and physical), are
directed to the on-chip cache where they form the index into the eight-way set-associative
tag array.
After translating the address, the MMU passes the higher-order bits of the physical address 
to the cache, and the cache lookup completes. For cache-inhibited accesses or accesses that 
miss in the cache, the untranslated lower order address bits are concatenated with the 
translated higher-order address bits; the resulting 32-bit physical address is then used by the 
memory unit and the system interface, which accesses external memory. 
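
For the page case, the split and concatenation described above amount to simple bit manipulation, sketched below in Python. The function names are hypothetical, and the translated upper bits are taken as an input because the lookup itself is described later in this chapter.

```python
PAGE_OFFSET_BITS = 12  # A20-A31 are untranslated (4-Kbyte pages)


def split_logical_address(la):
    """Split a 32-bit logical address into (LA0-LA19, A20-A31)."""
    return la >> PAGE_OFFSET_BITS, la & ((1 << PAGE_OFFSET_BITS) - 1)


def physical_address(translated_upper, la):
    """Concatenate translated bits PA0-PA19 with the untranslated lower-order bits."""
    _, offset = split_logical_address(la)
    return (translated_upper << PAGE_OFFSET_BITS) | offset
```

For example, if logical address x'12345678' translates to an upper field of x'ABCDE', the resulting physical address is x'ABCDE678'.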

In addition to the upper-order address bits, the MMU automatically keeps an internal 
indicator of whether each access was generated as an instruction or data access and a 
supervisor/user indicator that reflects the state of the PR bit of the MSR when the logical 
address was generated. In addition, for data accesses, there is an indicator of whether the 
access is for a load or a store operation. This information is then used by the MMU to 
appropriately direct the address translation and to enforce the protection hierarchy 
programmed by the operating system. See Section 2.3.1, "Machine State Register (MSR)," 
for more information about the MSR. 

For instruction accesses, the MMU first performs a lookup in the four entries of the ITLB
for the physical address translation. Instruction accesses that miss in the ITLB and data
accesses cause a lookup in the UTLB and BTLB for the physical address translation. In
most cases, the physical address translation resides in one of the TLBs and the physical
address bits are readily available to the on-chip cache. In the case where the physical
address translation misses in the TLBs, the MPC601 automatically performs a search of the
translation tables in memory using the information in the SDR1 and the corresponding
segment register.

6.1.3 Address Translation Mechanisms 

The MPC601 supports the following four main types of address translation: 

• Page address translation — translates the page frame address for a 4-Kbyte page size 

• Block address translation — translates the block number for blocks that range in size 
from 128 Kbyte to 8 Mbyte 

• I/O controller interface address translation — used to generate I/O controller 
interface accesses on the external bus 

• Direct address translation — when address translation is disabled, the physical 
address is identical to the logical address 

Figure 6-2 shows the four main address translation mechanisms provided by the MPC601. 
The segment registers shown in the figure control both the page and I/O controller interface 
address translation mechanisms. When an access uses the page or I/O controller interface 
address translation, one of the 16 on-chip segment registers is selected by the highest-order 
logical address bits. A control bit in the corresponding segment register then determines if 
the access is to memory (memory-mapped) or to the I/O controller interface space. 

For memory accesses selected by the segment register, the segment register information is 
used to generate the interim 52-bit virtual address. Page address translation corresponds to 
the conversion of this virtual address into the 32-bit physical address used by the cache or 
by external memory. In most cases, the physical address for the page resides in the UTLB 
and is available for quick access. However, if the page address translation misses in the 
UTLB, the MPC601 automatically searches the page tables in memory (using the virtual
address information and a hashing function) to locate the required physical address. 
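
The formation of the interim virtual address can be sketched in Python as follows. The text gives only the 52-bit total and the selection of one of 16 segment registers by the highest-order logical address bits; the 24-bit segment identifier field used here is an assumption derived from 52 bits minus the 28 bits within a 256-Mbyte segment, and the list of identifiers is a hypothetical stand-in for the segment registers. The hashed page table search is not modeled.

```python
SEGMENT_BITS = 28  # a 256-Mbyte segment spans 2**28 bytes


def virtual_address(la, segment_ids):
    """Form the interim 52-bit virtual address for a 32-bit logical address.

    segment_ids -- list of 16 segment identifiers, one per segment register
                   (hypothetical representation; assumed to be 24 bits each)
    """
    sr_index = la >> SEGMENT_BITS            # highest-order 4 bits select the register
    segment_id = segment_ids[sr_index] & 0xFFFFFF
    within_segment = la & ((1 << SEGMENT_BITS) - 1)
    return (segment_id << SEGMENT_BITS) | within_segment
```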

Block address translation occurs in parallel with page and I/O controller interface address 
translation and is similar to page address translation, except that there are fewer upper-order 
logical address bits to be translated into physical address bits (more lower-order address 
bits (at least 17) are untranslated to form the offset into a block). Also, instead of segment 
registers and a UTLB, block address translations use the on-chip BAT registers as a BTLB. 
If the logical address of an access matches the corresponding field of a BAT register, the 
information in the BAT register is used to generate the physical address; in this case, the 
results of the page translation (occurring in parallel) are ignored. 
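
The relationship between block size and the number of untranslated bits can be illustrated with a short Python sketch. It assumes block sizes are powers of two within the stated 128-Kbyte to 8-Mbyte range; the function names are hypothetical and the BAT register matching itself is not modeled.

```python
MIN_BLOCK = 128 * 1024       # 128 Kbyte
MAX_BLOCK = 8 * 1024 * 1024  # 8 Mbyte


def block_offset_bits(block_size):
    """Return the number of untranslated lower-order bits for a block size."""
    is_power_of_two = block_size > 0 and block_size & (block_size - 1) == 0
    if not is_power_of_two or not MIN_BLOCK <= block_size <= MAX_BLOCK:
        raise ValueError("unsupported block size")
    return block_size.bit_length() - 1


def split_block_address(la, block_size):
    """Split a logical address into (bits to translate, offset into the block)."""
    bits = block_offset_bits(block_size)
    return la >> bits, la & ((1 << bits) - 1)
```

The smallest block leaves 17 bits untranslated, matching the "at least 17" figure above, and the largest leaves 23.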

Figure 6-2. Address Translation Types

I/O controller interface address translation is enabled when the I/O controller interface 
translation control bit (T-bit) in the selected segment register (segment register selected by 
the highest-order address bits) is set. In this case, the remaining information in the segment 
register is interpreted as identifier information that is used with the remaining logical 
address bits to generate the packets used in an I/O controller interface access on the external 
interface; additionally, no UTLB lookup or page table search is performed and the BTLB 
lookup results are ignored. For more information about the I/O controller interface 
operations, see Section 9.6, "I/O Controller Interface Operation." 

A special case of I/O controller interface address translation (not shown in Figure 6-2) is
supported that forces an I/O controller interface address translation to be interpreted as a
memory access (that is, it uses the usual memory access protocol rather than the I/O
controller interface access protocol on the external interface). This occurs when a field in
the selected segment register (with the T-bit set) is encoded as memory-forced I/O
controller space. This feature effectively allows the specification of a 256-Mbyte "block"
of memory (with a common physical block number) with the use of only one segment
register, bypassing the page and block address translation and protection mechanisms
described above.

Direct address translation occurs when address translation is disabled; in this case the 
physical address generated is identical to the logical address. The translation of addresses 
for instruction and data accesses is enabled (and disabled) independently with the MSR[IT] 
and MSR[DT] bits, respectively. Thus when the instruction unit generates an instruction 
access, and instruction address translation is disabled (MSR[IT] = 0), the resulting physical 
address is identical to the logical address and all other translation mechanisms are ignored. 

When a data access occurs and MSR[DT] = 0, the resulting physical address is identical to 
the logical address with one exception — I/O controller interface address translation for data 
accesses is allowed, even when MSR[DT] = 0. In this case, the segment registers are used 
in the same way as if translation were enabled. Note that this case of data accesses to the
I/O controller interface while MSR[DT] = 0 will not be supported in other PowerPC
processors.
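
For data accesses, the selection among the translation mechanisms described in this section can be summarized in a Python sketch. It is a simplified decision model under the rules stated above, with hypothetical names and string labels; instruction accesses are not covered, since with MSR[IT] = 0 they always use direct translation.

```python
def data_translation_type(msr_dt, t_bit, memory_forced=False, bat_hit=False):
    """Select the address translation mechanism used for a data access.

    msr_dt        -- MSR[DT], data address translation enable
    t_bit         -- T-bit of the segment register selected by the address
    memory_forced -- selected segment register encodes memory-forced I/O space
    bat_hit       -- logical address matches a BAT register (BTLB hit)
    """
    if t_bit:
        # Allowed even when MSR[DT] = 0; BTLB lookup results are ignored.
        if memory_forced:
            return "memory-forced I/O controller interface"
        return "I/O controller interface"
    if not msr_dt:
        return "direct"
    if bat_hit:
        # Block translation takes precedence over the parallel page translation.
        return "block"
    return "page"
```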

6.1.4 Memory Protection Facilities 

In addition to the translation of logical addresses to physical addresses, the MMU provides 
access protection of supervisor areas from user access and can designate areas of memory 
as read-only. Table 6-1 shows the four protection options supported by the MPC601. 

Table 6-1. Access Protection Options

Option                   User Read     User Write    Supervisor Read   Supervisor Write

Supervisor-only          Not allowed   Not allowed   √                 √
Supervisor-write-only    √             Not allowed   √                 √
Both user/supervisor     √             √             √                 √
Both read-only           √             Not allowed   √                 Not allowed

Each of these options is enforced at the block or page level. Thus, the supervisor-only 
option allows only read and write operations generated while the MPC601 is operating in
supervisor mode (corresponding to MSR[PR] = 0) to use the selected address translation 
(block or page). User accesses that map into these blocks or pages cause an exception to be 
taken. 

As shown in the table, the supervisor-write-only option allows both user and supervisor 
accesses to read from the selected area of memory but only supervisor programs can update 
(write to) that area. There is also an option that allows both supervisor and user programs 
read and write access (both user/supervisor option), and finally, there is an option to 
designate an area of memory as read-only, both for user and supervisor programs (both
read-only option). 

For I/O controller interface segments, the MMU calculates a "key" bit based on the 
protection values programmed in the segment register, and the specific user/supervisor and 
read/write information for the particular access. However, this bit is merely passed on to
the system interface to be transmitted in the context of the I/O controller interface protocol 
as described in Section 9.6, "I/O Controller Interface Operation." The MMU does not itself 
enforce any protection or cause any exception based on the state of the key bit for these 
accesses. The I/O controller device or other external hardware can optionally use this bit to 
enforce any protection required. 

6.1.5 Page History Information 

The MPC601 MMU also maintains reference (R) and change (C) bits in the page address 
translation mechanism that can be used as history information relevant to the page. This 
information can then be used by the operating system to determine which areas of memory 
to write back to disk when new pages must be allocated in main memory. While these bits 
are initially programmed by the operating system into the page table, the MPC601 
automatically updates these bits when required. Note that the updates to these bits in the 
page tables are performed with standard read and write transactions on the bus (not locked 
read-modify-write operations). However, when multiple MPC601 devices have shared 
access to the page tables, the bit settings are guaranteed to be updated correctly. 

6.1.6 General Flow of MMU Address Translation 

When an instruction or data access is generated and the corresponding instruction or data 
translation is disabled (MSR[IT] = 0 or MSR[DT] = 0), direct translation is used and the
access continues to the cache. When the selected segment register indicates that the access 
is an I/O controller interface access, I/O controller interface address translation occurs. See 
Section 6.5, "Selection of Address Translation Type" for more information regarding the 
selection of address translation mode used for all cases. 

For instruction accesses, if translation is enabled (MSR[IT] = 1), the ITLB is first checked
for a matching page or block address translation. If there is a miss, then the MMU uses the 
block and page address translation mechanisms to find the address translation. Figure 6-3 
shows the flow used to search for the block or page address translation. 

Although the MPC601 performs the block and page TLB lookups simultaneously, the flow 
diagram shows that if a BTLB hit occurs, that particular translation is performed regardless 
of the results of the UTLB lookup. If the BTLB misses, the results of the UTLB search are 
considered. If the UTLB hits, the page translation occurs and the physical address bits are 
forwarded to the cache (if the access is cacheable). 

Figure 6-3. MMU Block and Page Address Translation Flow

If the UTLB misses, the MPC601 automatically searches the page tables in memory. If the 
page table entry (PTE) is successfully read, a new UTLB entry (and an ITLB entry for the 
instruction access case) is created and the page translation is once again attempted. This 
time, the UTLB (and ITLB for instruction access case) is guaranteed to hit. If the PTE is 
not found by the table search operation, an instruction access or data access exception is 
generated. 

Note that if either the BTLB or UTLB results in a hit, the access is qualified with the
appropriate protection bits. If the access is determined to be protected (not allowed), an 
exception (instruction access or data access) is generated. 
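
The lookup order described above can be summarized in a short C sketch. This is an
illustrative model of the decision order only, not a description of the MPC601 hardware;
the type and function names (xlate_inputs, xlate_result, select_translation) are
hypothetical and do not appear elsewhere in this manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one translation attempt (names are illustrative only). */
typedef enum {
    XLATE_BLOCK,       /* BTLB hit: block translation is used             */
    XLATE_PAGE,        /* UTLB hit, or UTLB reloaded by the table search  */
    XLATE_PAGE_FAULT,  /* table search found no PTE                       */
    XLATE_PROT_FAULT   /* translation found, but access is not allowed    */
} xlate_result;

/* What each stage of the lookup would report for a given access. */
typedef struct {
    bool btlb_hit;        /* block TLB matched                             */
    bool utlb_hit;        /* unified (page) TLB matched                    */
    bool pte_in_table;    /* table search in memory would find the PTE     */
    bool access_allowed;  /* protection check on the selected translation  */
} xlate_inputs;

xlate_result select_translation(const xlate_inputs *in)
{
    xlate_result r;

    assert(in != NULL);
    if (in->btlb_hit) {
        r = XLATE_BLOCK;              /* BTLB hit wins regardless of the UTLB  */
    } else if (in->utlb_hit || in->pte_in_table) {
        r = XLATE_PAGE;               /* direct UTLB hit, or hit after reload  */
    } else {
        return XLATE_PAGE_FAULT;      /* instruction or data access exception  */
    }
    /* Any hit is then qualified with the protection bits. */
    return in->access_allowed ? r : XLATE_PROT_FAULT;
}
```

In this model a BTLB hit and a UTLB hit reported together always yield XLATE_BLOCK,
which mirrors the priority shown in Figure 6-3.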

6.1.7 Memory/MMU Coherency Model 

The memory model of the MPC601 provides the following features:

• Performance benefits of weak ordering of memory accesses 

• Memory coherency among processors and between a processor and I/O devices 
controlled at the block and page level 

• Instructions that ensure a coherent and ordered memory state. 

• Processor address order guaranteed 

The memory implementations in MPC601 systems can take advantage of the performance 
benefits of weak ordering of memory accesses between processors or between processors 
and other external devices without any additional complications. The MMU assumes that 
all accesses are ordered. Thus, the priority of accesses is determined at the external 
interface in a way that provides maximum throughput for most cases. 

In addition, at the system level, the memory coherency among processors and between a 
processor and I/O devices is programmed through the following three mode control bits in 
the MMU: 

• Write-through (W bit) 

• Caching-inhibited (I bit) 

• Memory coherency (M bit) 

Both the block and page address translation mechanisms contain the WIM bits for each 
TLB entry; these bits are used to control all accesses that correspond to the particular block 
or page. The four possible combinations of the W and M bits yield modes that are supported 
for I = 0 (caching-allowed) as shown in Table 6-2. For the caching-inhibited (I = 1) case,
there are only two modes defined, corresponding to W = 0/M = 0 and W = 0/M = 1.

Table 6-2. Defined WIM Combinations 

The MPC601 also provides instructions (the cache instructions, isync, sync, eieio, lwarx,
and stwcx.) to ensure a coherent and ordered memory state. These instructions are
described in Chapter 3, "Addressing Modes and Instruction Set Summary." Memory
accesses performed by a single processor appear to complete sequentially from the view of
the programming model but may complete out of order with respect to the ultimate 
destination in the memory hierarchy. Order is guaranteed at each level of the memory 
hierarchy for accesses to the same address from the same processor. 

Memory coherency can be enforced externally by a snooping bus design, a centralized 
cache directory design, or other design that can take advantage of these coherency features. 

6.1.8 Effects of Instruction Prefetch on MMU 

Speculative instruction execution occurs when the MPC601 executes instructions in 
advance in case the result is needed. If subsequent events indicate that the speculative 
instruction should not have been executed, the processor abandons the results produced by 
that instruction. Typically, the MPC601 executes instructions speculatively when it has
resources that would otherwise be idle, so the operation is done at little or no cost. 

The MPC601 executes computational instructions speculatively (beyond a branch
instruction) and performs instruction prefetches (it fetches instructions ahead in the
instruction stream). However, the MPC601 does not execute any load or store instructions
speculatively. Speculative execution of computational instructions does not involve the 
MMU. 

To avoid instruction fetch delay, the processor typically prefetches instructions. Such 
instruction prefetching is speculative in that prefetched instructions may not be executed 
due to intervening branches or exceptions. 

The following constraints are enforced for instruction prefetching: 

• Prefetching does not occur across a page boundary (4 Kbyte). The processor only 
fetches from a new page when it is certain that at least the first instruction to be 
fetched from the new page is required for execution by the program.

• Prefetching from non-cacheable (I=1) memory does not occur, except when an
instruction within the boundaries of a cache block (sector) is needed. In that case, 
the subsequent words in the cache block (sector) are prefetched. 

• Neither fetching nor prefetching from I/O controller interface segments (T=1)
occurs, except for instruction fetches (or prefetches) made from a memory-forced I/O
controller interface segment. See Section 6.10, "I/O Controller Interface Address
Translation," for more information about I/O controller interface segments.
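
The three constraints above can be expressed as a single predicate. The following C
sketch is only a model of the rules as listed; the function name may_prefetch and its
parameters are hypothetical, and the 32-byte sector size is an assumption that is not
stated in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE   4096u  /* 4-Kbyte page, from the text                       */
#define SECTOR_SIZE 32u    /* ASSUMPTION: sector size, not given in this section */

/*
 * needed          address of an instruction known to be required
 * candidate       address of a later instruction being considered for prefetch
 * cache_inhibited I bit of the translation (I=1)
 * seg_t           T bit of the selected segment register (T=1)
 * memory_forced   segment is a memory-forced I/O controller interface segment
 */
bool may_prefetch(uint32_t needed, uint32_t candidate,
                  bool cache_inhibited, bool seg_t, bool memory_forced)
{
    assert(candidate >= needed);  /* prefetch runs ahead of the needed fetch */

    /* No fetch or prefetch from T=1 segments unless memory-forced. */
    if (seg_t && !memory_forced)
        return false;

    /* No prefetch across a 4-Kbyte page boundary. */
    if (needed / PAGE_SIZE != candidate / PAGE_SIZE)
        return false;

    /* For I=1 memory, only the rest of the sector holding the needed word. */
    if (cache_inhibited && needed / SECTOR_SIZE != candidate / SECTOR_SIZE)
        return false;

    return true;
}
```

Whether the page-boundary rule also applies inside a memory-forced segment is not
stated here; the sketch applies it in all cases as the more conservative choice.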

Machine check exceptions that result from instruction prefetching may be generated, even 
if the instruction fetched would not have been executed because of a previous branch or 
change in program flow. See Section 5.4.2, "Machine Check Exception (x'00200')," for
more information on the machine check exception. 

Memory in the MPC601 systems is considered not "guarded" in the sense that prefetching
may occur to any area of memory. For example, if a data area is adjacent to an instruction 
area of memory, the MPC601 could prefetch from that data area. Furthermore, if a word in 
that data area contains the encoding for an unconditional branch instruction, the processor 
could even continue to prefetch from the address it interprets as the target of the branch. 
Care may be required to prevent these situations, particularly if peripheral devices that 
cannot recover from extraneous accesses reside in these areas. Areas of memory in other 
PowerPC processors may be designated as guarded within the MMU in that speculative 
operations do not occur. 

6.1.9 Breakpoint Facility 

Through the use of the HIDx registers (HID1, HID2, and HID5), the MPC601 has the
ability to perform a breakpoint operation for both instruction and data accesses
independently. For instruction accesses, the logical addresses of instructions in decode are
compared with the address specified in the instruction address breakpoint register (IABR).
If there is a match, then the processor takes a run mode exception. Similarly, data
breakpoints occur when the logical address of a data access matches the address specified
in the data address breakpoint register (DABR). However, when a data address
matches, the MPC601 takes a data access exception. 

The instruction and data breakpoint functionality is controlled by bit settings in the 
MPC601 debug modes register (HID1). Various combinations and levels of breakpoints can
be enabled. Section 2.3.3.12, "MPC601 Implementation-Specific HID Registers," describes
the breakpoint functionality provided in the MPC601. Note that these breakpoints occur
completely independently of the MSR[DT] and MSR[IT] bit settings. 

6.1.10 MMU Exceptions Summary

In order to complete any memory access, the logical address must be translated to a 
physical address. An MMU exception condition occurs if this translation fails for one of the 
following reasons: 

• There is no valid entry in the page table for the page specified by the logical address 
(and segment register). 

• An address translation is found but the access is not allowed by the memory 
protection mechanism. 

Most MMU exception conditions cause either the instruction access exception or the data 
access exception to be taken. The state saved by the MPC601 for each of these exceptions 
contains information that identifies the address of the failing instruction. Refer to 
Chapter 5, "Exceptions," for a more detailed description of exception processing. 

There are 11 types of conditions that can cause an MMU exception to occur. The exception
conditions map to the MPC601 exceptions as shown in Table 6-3. The only MMU exception
condition recognized when MSR[IT]=0 is the instruction breakpoint match condition. The
only exception conditions that occur when MSR[DT]=0 are the data breakpoint match
condition and the conditions that cause the alignment exception for data accesses. For more
detailed information about the conditions that cause the alignment exception (in particular
for string/multiple instructions) see Section 5.4.6, "Alignment Exception (x'00600')."



Table 6-3. MMU Exception Conditions/Exception Mapping

Condition                      Description                         Exception
-----------------------------  ----------------------------------  ---------------------------------------
Page fault                     No matching PTE found in page       I access: instruction access exception,
                               tables                              SRR1[1] = 1
                                                                   D access: data access exception,
                                                                   DSISR[1] = 1

Block protection violation     Conditions described in Table 6-7   I access: instruction access exception,
                               for block                           SRR1[4] = 1
                                                                   D access: data access exception,
                                                                   DSISR[4] = 1

Page protection violation      Conditions described in Table 6-7   I access: instruction access exception,
                               for page                            SRR1[4] = 1
                                                                   D access: data access exception,
                                                                   DSISR[4] = 1

dcbz with W or I = 1           dcbz instruction to write-through   Alignment exception
                               or cache-inhibited segment or
                               block

Instruction access to I/O      Attempt to fetch instruction when   Instruction access exception
controller interface space     SR[T] = 1, SR[BUID] ≠ x'07F'        Causes no SRR1 bits to be set

lwarx, stwcx., lscbx           Reservation instruction or load     Data access exception
instruction to I/O controller  string and compare byte             DSISR[5] = 1
interface space                instruction when SR[T] = 1,
                               SR[BUID] ≠ x'07F'

Floating-point load or store   FP memory access when               Alignment exception
to I/O controller interface    SR[T] = 1, SR[BUID] ≠ x'07F'
space

Instruction breakpoint match   Instruction address matches the     Run mode exception
                               address in HID2

Data breakpoint match          Data address matches the address    Data access exception
                               in HID5                             DSISR[9] = 1

Operand misalignment:          Operand crosses a 256 Mbyte         Alignment exception
256 Mbyte boundary             boundary (regardless of MSR[DT]
                               and MSR[IT] setting)

Operand misalignment:          Page translation (SR[T] = 0 and no  Alignment exception
4 Kbyte boundary               BTLB match), and operand crosses a
                               4 Kbyte boundary

6.1.11 MMU Instructions and Register Summary

Table 6-4 summarizes the instructions of the MPC601 that specifically control the MMU.
For more detailed information about the instructions, refer to Chapter 10, "Instruction Set." 

Table 6-4. Instruction Summary— Control MMU 



Instruction    Description
-------------  -------------------------------------------------------------
mtsr SR,rS     Move to Segment Register
               SR[SR#] ← rS

mtsrin rS,rB   Move to Segment Register Indirect
               SR[rB[0-3]] ← rS

mfsr rD,SR     Move from Segment Register
               rD ← SR[SR#]

mfsrin rD,rB   Move from Segment Register Indirect
               rD ← SR[rB[0-3]]

tlbie rB       Translation Lookaside Buffer Invalidate Entry
               If TLB hit (for logical address specified as rB), TLB[V] ← 0
               Causes TLBI operation on the system bus.



Table 6-5 summarizes the registers that the operating system uses to program the MMU. 
These registers are accessible to supervisor-level software only. These registers are 
described in detail in Chapter 2, "Registers and Data Types." 

Table 6-5. MMU Registers 



Register                  Description
------------------------  ---------------------------------------------------------------------
Segment registers         The sixteen 32-bit segment registers are present only in 32-bit
(SR0-SR15)                implementations of PowerPC. Figure 6-13 shows the format of a segment
                          register. The fields in the segment register are interpreted
                          differently depending on the value of bit 0. The segment registers
                          are accessed by the MPC601-specific mtsr, mtsrin, mfsr, and mfsrin
                          instructions.

BAT registers             The MPC601 includes eight block-address translation registers (BATs),
(BAT0U-BAT3U and          organized as four pairs (BAT0U-BAT3U and BAT0L-BAT3L). Figure 6-6 and
BAT0L-BAT3L)              Figure 6-7 show the format of the upper and lower BAT registers.
                          These are special-purpose registers that are accessed by the mtspr
                          and mfspr instructions.

Table search descriptor   The 32-bit table search descriptor register 1 (SDR1) specifies the
register 1 (SDR1)         variables used in accessing the page tables in memory. This is a
                          special-purpose register that is accessed by the mtspr and mfspr
                          instructions.



6.1.12 TLB Entry Invalidation 

The UTLB (and ITLB) maintains on-chip copies of the PTEs that are resident in physical 
memory. The MPC601 has the ability to invalidate resident UTLB entries through the use 
of the tlbie instruction. The tlbie instruction also causes a TLB invalidate
broadcast (an address-only operation) to occur on the system bus so that other processors 
also invalidate their resident copies of the matching PTE. See Chapter 10, "Instruction Set" 
for detailed information about the tlbie instruction and Section 9.3.2.2.1, "Transfer Type
(TT0-TT4) Signals," for more information on address-only bus transactions.

The snooping hardware of the MPC601 detects when other processors perform a TLB 
invalidate broadcast on the bus. In the case of a hit with an on-chip UTLB entry, the 
MPC601 performs the following: 

1. Prevents execution of any new load, store, cache control or tlbie instructions and 
prevents any new reference or change bit updates 

2. Waits for completion of any outstanding memory operations (including updates to 
the reference and change bits associated with the entry to be invalidated) 

3. Invalidates the two entries (both associativity classes) in the UTLB indexed by the 
matching address 

4. Resumes normal execution 

6.2 ITLB Description 

The MPC601 implements a four-entry, fully-associative TLB for storing the most recently
used instruction address translations. The MPC601 automatically generates an entry in the 
ITLB whenever the page or block address translation mechanism generates a new logical- 
to-physical mapping for a page or block used for instruction fetch. Each ITLB entry can 
contain the translation information for either an entire block or a page. The MPC601 uses 
the ITLB for address translation of instruction accesses when MSR[IT] = 1. 

The instruction unit accesses the ITLB independently of the rest of the MMU. Therefore, 
when instruction accesses hit in the ITLB, the page and block translation mechanisms are 
available for use by data accesses simultaneously. 

The MPC601 also automatically maintains the integrity of the entries in the ITLB by 
purging the contents when any of the following conditions occur: 

• An mtsr or mtsrin instruction is executed 

• An mtspr instruction that modifies any of the BAT registers is executed 

• A tlbie instruction is executed 

• A TLB invalidate operation is detected on the system interface (via snooping) 

Since these conditions potentially cause the MMU context to be changed, the ITLB entries 
may no longer be valid. Therefore, the MMU automatically detects these conditions and
clears all the valid bits in the ITLB array. 

Finally, the MPC601 replaces ITLB entries on a least-recently-used (LRU) basis. 
Throughout the remainder of this chapter, the page and block translations that are resident 
in the ITLB are described within the context of page address translation and block address 
translation, as the contents of the ITLB are always a subset of translations that were 
generated for the UTLB and/or the BTLB. 

Accesses to the ITLB are transparent to the executing program, except that hits in the ITLB
contribute to a higher overall instruction throughput by allowing data translations to occur 
in parallel. 
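
The ITLB behavior described in this section (four entries, fully associative, LRU
replacement, and a full purge on any MMU context change) can be modeled with a small
C sketch. This is a software illustration under stated assumptions, not the hardware
design: the structure layout, the timestamp-based LRU bookkeeping, and all names
(itlb, itlb_entry, itlb_lookup, itlb_insert, itlb_purge) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ITLB_ENTRIES 4  /* four-entry, fully associative (from the text) */

typedef struct {
    bool     valid;
    uint32_t tag;       /* hypothetical: logical page or block number   */
    uint32_t phys;      /* hypothetical: translated physical number     */
    uint32_t last_use;  /* timestamp used here to model LRU ordering    */
} itlb_entry;

typedef struct {
    itlb_entry e[ITLB_ENTRIES];
    uint32_t   clock;   /* incremented on every use or insertion        */
} itlb;

/* Clear all valid bits, as on mtsr/mtsrin, an mtspr to a BAT register,
 * tlbie, or a snooped TLB invalidate operation. */
void itlb_purge(itlb *t)
{
    assert(t != NULL);
    for (int i = 0; i < ITLB_ENTRIES; i++)
        t->e[i].valid = false;
}

/* Fully associative lookup: every entry is compared against the tag. */
bool itlb_lookup(itlb *t, uint32_t tag, uint32_t *phys)
{
    assert(t != NULL && phys != NULL);
    for (int i = 0; i < ITLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].tag == tag) {
            t->e[i].last_use = ++t->clock;  /* mark as most recently used */
            *phys = t->e[i].phys;
            return true;
        }
    }
    return false;
}

/* Insert a new translation, using an invalid entry if one exists and
 * otherwise replacing the least recently used entry. */
void itlb_insert(itlb *t, uint32_t tag, uint32_t phys)
{
    int victim = 0;

    assert(t != NULL);
    for (int i = 0; i < ITLB_ENTRIES; i++) {
        if (!t->e[i].valid) { victim = i; break; }
        if (t->e[i].last_use < t->e[victim].last_use)
            victim = i;
    }
    t->e[victim] = (itlb_entry){ true, tag, phys, ++t->clock };
}
```

A caller would invoke itlb_insert after the block or page translation mechanism
produces a new mapping for an instruction fetch, and itlb_purge on any of the four
conditions listed above.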

6.3 Memory/Cache Access Modes 

All instruction and data accesses are performed under the control of the three mode control 
bits that are defined by the MMU for each access. The three mode control bits, W, I, and M, 
have the following effects. The W and I bits control how the processor performing the 
access uses its own cache. The M bit specifies whether the processor performing the access 
must use the memory coherency protocol to ensure that all copies of the addressed memory 
location are consistent. 

When an access requires coherency, the processor performing the access must inform the 
coherency mechanisms throughout the system that the access requires memory coherency. 
The M bit determines the kind of access performed on the bus (global or local). Note that 
these mode-control bits are relevant only when an address is translated and are not saved 
along with data in the on-chip cache (for cacheable accesses). Once an access has been 
translated, the MESI bits in the cache then control the coherency to that cache location 
made by subsequent accesses from other processors. See Chapter 4, "Cache and Memory 
Unit Operation," for more information about cache accesses. 

The operating system programs the WIM bits for each page or block as required. The WIM 
bits reside in the BAT registers for block address translation and in the PTEs for page 
address translation. Thus these bits are programmed as follows: 

• The operating system uses the mtspr instruction to program the WIM bits in the 
BAT registers for block address translation. 

• The operating system programs the WIM bits for each page into the PTEs in system 
memory as it sets up the page tables. 

Note that for accesses performed with direct address translation (MSR[IT]=0 or
MSR[DT]=0 for instruction or data access, respectively), the WIM bits are automatically
generated as b'001' (the data is write-back, caching is enabled, and memory coherency is
enforced). 

6.3.1 Write-Through Bit (W) 

When an access is designated as write-through (W=1), if the data is in the cache, a store
operation updates the cached copy of the data. In addition, the update is written to the 
external memory location (as described below). Store-combining compiler optimizations 
are allowed for write-through accesses except when the store instructions are separated by 
a sync instruction. Note that a store operation that uses the write-through mode may cause 
any part of valid data in the cache to be written back to main memory. 

The definition of the external memory location to be written to in addition to the on-chip
cache depends on the implementation of the memory system but can be illustrated by the 
following examples: 

• RAM: The store must be sent to the RAM controller to be written into the target
RAM.

• I/O device: The store must be sent to the I/O control hardware to be written to the
target register or memory location.

In systems with multilevel caching, the store must be written to at least a depth in the 
memory hierarchy that is seen by all processors and devices. 

Accesses that correspond to W=0 are considered write-back. For this case, although the 
store operation is performed to the cache, it is only made to external memory when a copy- 
back operation is required. Use of the write-back mode (W=0) can improve overall 
performance for areas of the memory space that are seldom referenced by other masters in 
the system. See Chapter 4, "Cache and Memory Unit Operation," for more information 
about cache accesses. 

6.3.2 Caching Inhibited Bit (I) 

If I=1, the memory access is completed by referencing the location in main memory,
completely bypassing the on-chip cache of the MPC601. During the access, the accessed
location is not loaded into the cache nor is the location allocated in the cache. If a copy of 
the accessed data is in the cache, that copy is not updated, flushed, or invalidated. Data 
accesses from more than one instruction may not be combined (as a compiler optimization) 
for cache-inhibited operations. 

6.3.3 Memory Coherence Bit (M)

This mode control bit is provided to allow improved performance in systems where 
hardware-enforced coherency is relatively slow, and software is able to enforce the required 
coherency. When M=0, the MPC601 does not enforce data coherency. When M=1, the
processor enforces data coherency and the corresponding access is considered to be a 
global access. When the M bit is set, and the access is performed to external memory, the 
GBL signal is asserted and the access is designated as global. Other processors affected by 
the access must then respond to this global access and signal whether it is shared. If the data 
in another processor is modified, then address retry is signaled. 

6.3.4 W, I, and M Bit Combinations

Table 6-6 summarizes the six combinations of the WIM bits defined for the MPC601.

Table 6-6. Combinations of W, I, and M Bits 



WIM Setting  Meaning
-----------  ---------------------------------------------------------------------------------
000          Data may be cached.
             Loads or stores whose target hits in the cache use that entry in the cache.
             Exclusive ownership of the block containing the target location is not required
             for store accesses and coherency operations for the block do not occur when
             fetching the block, storing it back, or changing its state from shared to
             exclusive.

001          Data may be cached.
             Loads or stores whose target hits in the cache use that entry in the cache.
             Memory coherency is enforced by hardware as follows: exclusive ownership of the
             block containing the target location is required before store accesses are
             allowed. When fetching the block, the processor indicates on the bus transaction
             that coherency is to be enforced. If the state of the block is shared-unmodified,
             the processor must gain exclusive use of the block before storing into it.
             This encoding is used for addresses translated via direct address translation
             (MSR[IT] = 0 or MSR[DT] = 0).

010          Caching is inhibited.
             The access is performed to external memory, completely bypassing the cache.
             Hardware-enforced memory coherency is not required.

011          Caching is inhibited.
             The access is performed to external memory, completely bypassing the cache.
             Memory coherency must be enforced by external hardware (MPC601 asserts GBL).

100          Data may be cached.
             Loads whose target hits in the cache use that entry in the cache.
             Stores are written to external memory. The target location of the store may be
             cached and is updated on a hit.
             Exclusive ownership of the block containing the target location is not required
             for store accesses and coherency operations for the block do not occur when
             fetching the block, storing it back, or changing its state from shared to
             exclusive.

101          Data may be cached.
             Loads whose target hits in the cache use that entry in the cache.
             Stores are written to external memory. The target location of the store may be
             cached and is updated on a hit.
             Memory coherency is enforced by hardware as follows: exclusive ownership of the
             block containing the target location is required before store accesses are
             allowed. When fetching the block, the processor indicates on the bus transaction
             that coherency is to be enforced. If the state of the block is shared, the
             processor must gain exclusive use of the block before storing into it.



If the system software maps the same physical page with multiple page table entries that 
have different W, I, or M values, the results of the translation are undefined. 
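
As an illustration of the six defined settings, the following C sketch classifies a
3-bit WIM value. It assumes the bits are packed in the order W, I, M (matching the
b'001' notation used for direct address translation); the macro and function names are
hypothetical and are not part of any MPC601 software interface.

```c
#include <assert.h>
#include <stdbool.h>

/* WIM is treated as a 3-bit value in the order W, I, M (for example, b'001'). */
#define WIM_W 0x4u   /* write-through      */
#define WIM_I 0x2u   /* caching-inhibited  */
#define WIM_M 0x1u   /* memory coherency   */

/* Setting generated for direct address translation (MSR[IT]=0 or MSR[DT]=0). */
#define WIM_DIRECT 0x1u

/* True for the six settings listed in Table 6-6; W=1 with I=1 is not defined. */
bool wim_is_defined(unsigned wim)
{
    assert(wim <= 7u);
    return !((wim & WIM_W) && (wim & WIM_I));
}

/* True when the access may use the on-chip cache (I=0). */
bool wim_may_cache(unsigned wim)
{
    assert(wim <= 7u);
    return (wim & WIM_I) == 0;
}

/* True when the access is treated as global (M=1). */
bool wim_is_global(unsigned wim)
{
    assert(wim <= 7u);
    return (wim & WIM_M) != 0;
}
```

An operating system could use a check such as wim_is_defined before writing WIM
values into a BAT register or a PTE.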

6.4 General Memory Protection Mechanism

Another aspect of the MMU that is programmed at the block and page level is the memory 
protection option. The memory protection mechanism allows selectively granting read 
access, granting read/write access, and prohibiting access to areas of memory based on a 
number of control criteria. 

The memory protection mechanism is used by both the block and page address translation 
mechanisms in a similar way, as described here. For specific information unique to block 
address translation, refer to Section 6.7.4, "Block Memory Protection." For specific 
information unique to page address translation, refer to Section 6.8.5, "Page Memory 
Protection." 

For both block and page address translation, the memory protection mechanism is 
controlled by the following: 

• MSR[PR], which defines the mode of the access as follows: 

— MSR[PR]=0 corresponds to supervisor mode 

— MSR[PR]=1 corresponds to user mode 

• Ks and Ku, the supervisor and user key bits, which define the key for the block or 
page 

• The PP bits, which define the access options for the block or page 

The key bits (Ks and Ku) and the PP bits are located as follows for block and page address 
translation: 

• Ks and Ku are located in the upper BAT register for block address translation and in 
the selected segment register for a page address translation. 

• The PP bits are located in the upper BAT register for block address translation and 
in the PTE for page address translation. 

The key bits, the PP bits, and the MSR[PR] bit are used as follows: 

• When an access is generated, one of the key bits (Ks or Ku) is selected to be the key 
as follows: 

— For supervisor accesses (MSR[PR]=0), the Ks bit is used and Ku is ignored 

— For user accesses (MSR[PR]=1), the Ku bit is used and Ks is ignored 

• The selected key is used with the PP bits to determine if the load or store access is 
allowed. 

Table 6-7 shows the types of accesses that are allowed for the general case (all possible Ks, 
Ku, and PP bit combinations).

Table 6-7. Access Protection Control with Key

Key¹  PP²  Block Type
----  ---  ----------
0     00   Read/write
0     01   Read/write
0     10   Read/write
0     11   Read only
1     00   No access
1     01   Read only
1     10   Read/write
1     11   Read only

¹ Ks or Ku selected by state of MSR[PR]
² PP protection option bits in BTLB entry or PTE



Thus, the conditions that cause a protection violation are depicted in Table 6-8. Any access 
attempted (read or write) when the key = 1 and PP = 00, results in a protection violation 
exception condition. When key = 1 and PP = 01 , an attempt to perform a write access causes 
a protection violation exception condition. When PP = 10, all accesses are allowed, and 
when PP = 11, write accesses always cause an exception. The MPC601 takes either the 
instruction access exception or the data access exception (for an instruction or data access, 
respectively) when there is an attempt to violate the memory protection. 

Table 6-8. Exception Conditions for Key and PP Combinations

Key  PP   Prohibited Accesses
---  ---  -------------------
1    00   Read/write
1    01   Write
x    10   None
x    11   Write
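
The key selection and the checks in Table 6-7 and Table 6-8 can be written as a short
C sketch. It is a model of the tables only; the names select_key, access_permitted,
and access_type are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ACCESS_READ, ACCESS_WRITE } access_type;

/* Select the key: Ks for supervisor accesses (MSR[PR]=0), Ku for user
 * accesses (MSR[PR]=1). */
bool select_key(bool msr_pr, bool ks, bool ku)
{
    return msr_pr ? ku : ks;
}

/* Apply Table 6-7: returns true if the access is allowed. */
bool access_permitted(bool key, unsigned pp, access_type a)
{
    assert(pp <= 3u);
    switch (pp) {
    case 0:  return !key;                      /* key=1: no access          */
    case 1:  return !key || a == ACCESS_READ;  /* key=1: read only          */
    case 2:  return true;                      /* read/write for either key */
    default: return a == ACCESS_READ;          /* PP=11: read only          */
    }
}
```

A false result corresponds to a protection violation, which causes the instruction
access or data access exception as described above.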



Although any combination of the Ks, Ku and PP bits is allowed, the Ks and Ku bits can be 
programmed so that the value of the key bit for Table 6-7 directly matches the MSR[PR] 
bit for the access. In this case, the encoding of Ks=0 and Ku=1 is used for the BTLB entry
or the PTE, and the PP bits then enforce the protection options shown in Table 6-9. 

Table 6-9. Access Protection Encoding of PP Bits

PP     Option                  User Read    User Write   Supervisor Read  Supervisor Write
Field                          (Key=1)      (Key=1)      (Key=0)          (Key=0)
-----  ----------------------  -----------  -----------  ---------------  ----------------
00     Supervisor-only         Not allowed  Not allowed  √                √
01     Supervisor-write-only   √            Not allowed  √                √
10     Both user/supervisor    √            √            √                √
11     Both read-only          √            Not allowed  √                Not allowed



However, if the setting Ks=1 is used, supervisor accesses are treated as user reads and
writes with respect to Table 6-9. Likewise, if the setting Ku=0 is used, user accesses to the
block or page are treated as supervisor accesses in relation to Table 6-9. Therefore, by 
modifying one of the key bits (in either the BAT register or the segment register), the way 
the MPC601 interprets accesses (supervisor or user) in a particular block or segment can 
easily be changed. Note, however, that only supervisor programs can modify the key bits 
for the block or the segment as access to the BAT registers and the segment registers is 
privileged. 
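
The effect of the key bits can be illustrated with a short sketch. It assumes, consistent with the Ks=0/Ku=1 case above, that MSR[PR]=1 denotes a user-mode access; the function name select_key is illustrative only.

```python
def select_key(ks: int, ku: int, msr_pr: int) -> int:
    """Select the key bit for an access.

    Assumption: MSR[PR]=1 means user mode, so Ku applies;
    MSR[PR]=0 means supervisor mode, so Ks applies.
    """
    return ku if msr_pr else ks
```

With Ks=0 and Ku=1 the returned key equals MSR[PR]. With Ks=1, a supervisor access yields key = 1 and is therefore checked as a user access would be.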

When the memory protection mechanism prohibits a reference, one of the following occurs, 
depending on the type of access that was attempted: 

• For data accesses, a data access exception is generated and bit 4 of DSISR is set. If 
the access is a store, bit 6 of DSISR is also set. 

• For instruction accesses, an instruction access exception is generated and bit 4 of 
SRR1 is set.

See Chapter 5, "Exceptions," for more information about these exceptions. 
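
Because bit 0 is the most-significant bit of a 32-bit register in this numbering, the DSISR bits named above correspond to the masks computed below. This is a sketch for illustration; the helper names are assumptions and other DSISR bits are not modeled.

```python
def ppc_bit(n: int, width: int = 32) -> int:
    """Mask for bit n, where bit 0 is the most-significant bit."""
    return 1 << (width - 1 - n)

def dsisr_protection_bits(is_store: bool) -> int:
    """DSISR bits set for a data access protection violation."""
    value = ppc_bit(4)          # protection violation
    if is_store:
        value |= ppc_bit(6)     # the access was a store
    return value
```

A load violation therefore gives x'08000000' and a store violation gives x'0A000000'.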

6.5 Selection of Address Translation Type 

A description of the selection flow for determining the type of address translation to be 
performed is provided in Figure 6-4. The selection of address translation type differs for
instruction and data accesses in that I/O controller interface accesses are not allowed for 
instruction accesses when instruction address translation is disabled, and I/O controller 
interface accesses for data occur without regard for the enabling of data address translation. 

6.5.1 Address Translation Selection for Instruction Accesses 

Addresses for instruction accesses are translated under control of the IT bit of MSR. When 
any context-synchronizing event occurs within the MPC601, any prefetched instructions 
are discarded and refetched using the updated state of MSR[IT]. 

Figure 6-4. Address Translation Type Selection

6.5.1.1 Instruction Address Translation Disabled: MSR[IT]=0 

When instruction address translation is disabled, designated by MSR[IT]=0, the logical
address is interpreted as described in Section 6.6, "Direct Address Translation." 

6.5.1.2 Instruction Address Translation Enabled: MSR[IT]=1 

When instruction address translation is enabled (MSR[IT] = 1), instruction fetching occurs 
under control of one of the following address translation mechanisms: 

• Page address translation 

• Block address translation 

Note that for either of these translation mechanisms, the ITLB is first checked for the 
address translation. If the ITLB misses, then the corresponding segment register is accessed 
to see if the access is to the I/O controller interface space. If the access is not to the I/O 
controller interface space, the page and block address translation mechanisms are used as 
shown in Figure 6-3. 

In most cases, instructions cannot be fetched from the I/O controller interface segments and
attempting to execute an instruction fetched from an I/O controller interface segment 
causes an instruction access exception. However, instruction fetches are allowed when the 
address translation maps to segments with the T bit set (I/O controller interface segment) 
and with the memory-forced I/O controller interface encoding. This case is described in 
more detail in Section 6.10.4, "Memory-Forced I/O Controller Interface Accesses." 

6.5.2 Address Translation Selection for Data Accesses 

As shown in Figure 6-4, for data accesses, the corresponding segment register is selected 
independent of the DT bit of MSR. Addresses for data accesses are translated first under 
control of the T bit of the selected segment register. If T=1, the translation is to an I/O
controller interface segment. Otherwise, the translation is governed by the state of the DT 
bit of MSR. When the state of MSR[DT] changes, subsequent accesses are made using the 
new state of MSR[DT]. 

6.5.2.1 I/O Controller Interface Address Translation: T=1 in Segment 
Register 

I/O controller interface segments are used independently of MSR[DT]. When the segment 
register indexed by the upper-order logical address bits has the T bit set, the access is 
considered an I/O controller interface access and the I/O controller interface protocol of the 
external interface is used to perform the access to I/O controller space. 

Note, however, that an x'07F' encoding in the BUID field of the segment register defines
an access as a memory-forced I/O controller interface access. In this case, the memory 
protocol is used on the external interface. See Section 6.10, "I/O Controller Interface 
Address Translation" for more information on address translation for I/O controller 
interface accesses. 

6.5.2.2 Data Translation Disabled: MSR[DT]=0 

When MSR[DT]=0, the logical address is interpreted as described in Section 6.6, "Direct 
Address Translation." Note that as shown in Figure 6-4, the determination of whether the 
address maps to an I/O controller interface segment occurs prior to the checking of
MSR[DT]. Therefore, I/O controller interface address translation occurs independently of 
MSR[DT] for data accesses. The attempted execution of the eclwx or ecowx instructions 
while MSR[DT]=0 causes boundedly undefined results. 

6.5.2.3 Data Translation Enabled: MSR[DT]=1 

When data address translation is enabled (MSR[DT] = 1), data accesses employ one of the 
following translation mechanisms: 

• Page address translation 

• Block address translation 

The block and page address translation mechanisms locate the physical address for the 
access as described in Figure 6-3. 
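
The selection rules of Section 6.5 can be summarized in a short sketch. It models only the order of the checks described in the text (the ITLB lookup and the choice between block and page translation are omitted); the function name and the returned strings are illustrative assumptions.

```python
MEMORY_FORCED_BUID = 0x07F  # BUID encoding for memory-forced accesses

def translation_type(is_instruction: bool, msr_it: int, msr_dt: int,
                     t_bit: int, buid: int = 0) -> str:
    """Return the kind of translation selected for an access."""
    if is_instruction:
        # Instruction fetches check MSR[IT] before the T bit.
        if not msr_it:
            return "direct"
        if t_bit:
            if buid == MEMORY_FORCED_BUID:
                return "memory-forced I/O controller interface"
            return "instruction access exception"
        return "page or block"
    # Data accesses check the T bit before MSR[DT].
    if t_bit:
        if buid == MEMORY_FORCED_BUID:
            return "memory-forced I/O controller interface"
        return "I/O controller interface"
    if not msr_dt:
        return "direct"
    return "page or block"
```

Note the asymmetry: a data access with T=1 and MSR[DT]=0 still selects the I/O controller interface, whereas an instruction fetch with MSR[IT]=0 is always translated directly.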

6.6 Direct Address Translation

If address translation is disabled (MSR[IT] = 0 or MSR[DT] = 0) for a particular access
(fetch, load, or store), the logical address is treated as the physical address and is passed 
directly to the memory subsystem as a direct address translation. 

The addresses for accesses that occur in direct translation mode bypass all memory 
protection checks as described in Section 6.4, "General Memory Protection Mechanism," 
and do not cause the recording of reference and change information (described in 
Section 6.8.4, "Page History Recording"). Such accesses are performed as though the 
memory access mode bits ("WIM") were 001. That is, the cache is write-back and system 
memory does not need to be updated (W = 0), caching is enabled (I = 0), and data coherency 
is enforced with memory, I/O, and other processors (caches) (M=1 so data is global).

Whenever an exception occurs, the MPC601 clears both the MSR[IT] and MSR[DT] bits. 
Therefore, at least at the beginning of all exception handlers (including reset), the MPC601 
operates in direct address translation mode for instruction accesses (and data accesses that 
do not map to I/O controller interface space). If address translation is required for the 
exception handler code, the software must explicitly enable address translation by 
accessing the MSR as described in Chapter 2, "Registers and Data Types." 

Note that when translation is disabled, I/O controller interface segments can still be used 
for data accesses as the T bit of the segment registers is checked and segment registers with 
T=1 are used independently of MSR[DT].

Note also that an attempt to fetch from, load from, or store to a physical address that is not 
physically present in the system may cause a machine check exception (or even a checkstop 
condition), depending on the response by the system for this case. See Section 5.4.2,
"Machine Check Exception (x'00200')," for more information on machine check
exceptions. 

6.7 Block Address Translation

The block address translation (BAT) mechanism in the MPC601 provides a way to map 
ranges of logical addresses larger than a single page into contiguous areas of physical 
memory. Such areas can be used for data that is not subject to normal virtual memory 
handling (paging), such as a memory-mapped display buffer or an extremely large array of 
numerical data. 

This section first describes the implementation of block address translation in the MPC601,
including the block protection mechanism, and then summarizes block translation with a
detailed flow diagram.

6.7.1 BTLB Organization 

The block translation lookaside buffer (BTLB) of the MPC601 maintains the address 
translation information for four blocks of memory. The BTLB in the MPC601 is maintained 
by the system software and is implemented as a set of eight special-purpose registers
(SPRs). Each block is defined by a pair of SPRs called upper and lower BAT registers. 

The BAT registers can be read from or written to by the mfspr and mtspr instructions; 
access to the BAT registers is privileged. Section 6.7.3, "BAT Register Implementation of 
BTLB," gives more information about the BAT registers. Note that the BTLB entries are 
completely ignored for TLB invalidate operations detected on the system bus and in the 
execution of the tlbie instruction. 

Figure 6-5 shows the organization of the BTLB. Four pairs of BAT registers are provided 
for translating instruction and data addresses. These four pairs of BAT registers comprise 
the four-entry fully-associative BTLB (each BTLB entry corresponds to a pair of BAT 
registers). The BTLB is fully-associative in that all four entries are compared with the 
logical address of the access to check for a match simultaneously. 
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Figure 6-5. BTLB Organization 

Each pair of BAT registers defines the starting address of a block in the logical address 
space, the size of the block, and the start of the corresponding block in physical address 
space. If a logical address is within the range defined by a pair of BAT registers, its physical 
address is defined as the starting physical address of the block plus the lower order logical 
address bits. 

Blocks are restricted to a finite set of sizes, from 128 Kbytes (2^17 bytes) to 8 Mbytes (2^23
bytes). The starting address of a block in both logical address space and physical address 
space is defined as a multiple of the block size. 

Because the BTLB entries are used for both instruction and data accesses, if the same memory
address is to be mapped for both instruction fetching and data load and store operations, the
address mapping must be loaded into only one register pair.

It is an error for system software to program the BAT registers such that a logical address 
is translated by more than one BAT pair. If this occurs, the results are undefined and may 
include a spurious violation of the memory protection mechanism, a machine check
exception, or a checkstop condition.

6.7.2 Recognition of Addresses in BTLB 

The BTLB (BAT registers) is accessed in parallel with segmented address translation to 
determine whether a particular logical address corresponds to a block defined by the BTLB. 
If a logical address is within a valid BAT area, the physical address for the memory access 
is determined, as described in Section 6.7.5, "Block Physical Address Generation." 

Block address translation is enabled only when address translation is enabled (MSR[IT]=1 
and/or MSR[DT]=1) and only when the indexed segment register specifies T=0. That is, the
BAT does not apply to I/O controller interface segments (T=1). When the segment register
has T=1, the segment register translation is used. (This is true for both I/O controller
interface segments and memory-forced I/O controller interface segments.) 

The BAT registers and the segmented address translation mechanism can be programmed 
such that a particular logical address is within a BAT area and that logical address also has 
a segment register translation that corresponds to page address translation (T=0 in the 
segment register). When this occurs, the block address translation is used as shown in 
Table 6-10 and the segment address translation is ignored. 

Table 6-10. Address Translation Precedence for Blocks and Segments

Segment Register T Bit   Address Translation
0                        Matching BTLB entry prevails
1                        Segment register prevails



Additionally, a block can be defined to overlay part of a segment such that the block portion 
is non-paged although the rest of the segment is pageable. This allows non-paged areas to 
be specified within a segment, and PTEs for the part of the segment overlaid by the block 
are not required. 

6.7.3 BAT Register Implementation of BTLB 

Recall that the BTLB is comprised of four entries used for instruction accesses and data 
accesses. Each BTLB entry consists of a pair of BAT registers — an upper and a lower BAT 
register for each entry. The BAT registers are accessed with the mtspr and mfspr 
instructions and are only accessible to supervisor-level programs. See Section 3.7, 
"Processor Control Instructions," for a list of simplified mnemonics for use with the BAT 
registers. 

Figure 6-6 shows the format of the upper BAT registers and Figure 6-7 shows the format 
of the lower BAT registers. The format of the upper and lower BAT registers in the 
MPC601 differs from that of the BAT registers in other PowerPC processors.

Figure 6-6. Format of Upper BAT Registers

Figure 6-7. Format of Lower BAT Registers

The BAT registers contain the logical to physical address mappings for blocks of memory. 
This mapping information includes the logical address bits that are compared with the 
logical address of the access, the memory/cache access mode bits (WIM) and the protection 
bits for the block. In addition, the size of the block and the starting address of the block are 
defined by the block page number and block size mask fields. 

Table 6-11 describes the bits in the upper and lower BAT registers. 

Table 6-11. BAT Registers—Field and Bit Descriptions

Register    Bits    Name   Description

Upper BAT   0-14    BLPI   Block logical page index. This field is compared with
registers                  bits 0-14 of the logical address to determine if there
                           is a hit in that BTLB entry.
            15-24   —      Reserved
            25-27   WIM    Memory/cache access mode bits:
                             W  Write-through
                             I  Caching-inhibited
                             M  Memory coherence
                           For detailed information about the WIM bits, see
                           Section 6.3, "Memory/Cache Access Modes."
            28      Ks     Supervisor mode key. This bit interacts with MSR[PR]
                           and the PP field to determine the protection for the
                           block. For more information, see Section 6.4,
                           "General Memory Protection Mechanism."
            29      Ku     User mode key. This bit also interacts with MSR[PR]
                           and the PP field to determine the protection for the
                           block. For more information, see Section 6.4,
                           "General Memory Protection Mechanism."
            30-31   PP     Protection bits for block. This field interacts with
                           MSR[PR] and the Ks or Ku bit to determine the
                           protection for the block as described in Section 6.4,
                           "General Memory Protection Mechanism."

Lower BAT   0-14    PBN    Physical block number. This field is used in
registers                  conjunction with the BSM field to generate bits 0-14
                           of the physical address of the block.
            15-24   —      Reserved
            25      V      BAT register pair (BTLB entry) is valid if V=1
            26-31   BSM    Block size mask (0...5). BSM is a mask that encodes
                           the size of the block. Values for this field are
                           listed in Table 6-12.



The BSM field in the lower BAT register is a mask that encodes the size of the block. 
Table 6-12 defines the bit encodings for the BSM field of the lower BAT register. Note that 
the range of block sizes is a subset of that defined by the PowerPC architecture. 

Table 6-12. Lower BAT Register Block Size Mask Encodings

Block Size    BSM Encoding
128 Kbytes    00 0000
256 Kbytes    00 0001
512 Kbytes    00 0011
1 Mbyte       00 0111
2 Mbytes      00 1111
4 Mbytes      01 1111
8 Mbytes      11 1111



Only the values shown in Table 6-12 are valid for BSM. A logical address is determined to
be within a BAT area if the appropriate bits (determined by the BSM field) of the logical
address match the value in the BLPI field of the upper BAT register, and if the valid bit
(V) of the corresponding lower BAT register is set.
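
The encodings in Table 6-12 follow a regular pattern: each valid BSM value is a run of low-order ones, and the block size is (BSM + 1) times 128 Kbytes. The sketch below illustrates this relationship; the names VALID_BSM and block_size_bytes are illustrative assumptions.

```python
# The seven encodings listed in Table 6-12.
VALID_BSM = (0b000000, 0b000001, 0b000011, 0b000111,
             0b001111, 0b011111, 0b111111)

def block_size_bytes(bsm: int) -> int:
    """Return the block size in bytes for a valid 6-bit BSM encoding."""
    if bsm not in VALID_BSM:
        raise ValueError(f"invalid BSM encoding: {bsm:06b}")
    # 128 Kbytes is 2**17 bytes; each added one bit doubles the size.
    return (bsm + 1) << 17
```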

The boundary between the strings of zeros and ones in the BSM field determines the bits 
of the logical address that are used in the comparison with the BLPI field to determine if 
there is a hit in that BTLB entry. The rightmost bit of the BSM field is aligned with bit 14 
of the logical address; bits of the logical address corresponding to ones in the BSM field are 
then forced to zero for the comparison. 

The value loaded into the BSM field determines both the length of the block and the 
alignment of the block in both logical address space and physical address space. The values 
loaded into the BLPI and PBN fields must have at least as many low-order zeros as there 
are ones in BSM. 
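
The hit determination described above can be sketched as follows. The sketch assumes the field positions of Table 6-11 with bit 0 as the most-significant bit; the helper names field and btlb_hit are illustrative, and only a single BAT pair is examined.

```python
def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 32-bit word (bit 0 is the MSB)."""
    return (word >> (31 - last)) & ((1 << (last - first + 1)) - 1)

def btlb_hit(upper: int, lower: int, ea: int) -> bool:
    """Check a logical address against one upper/lower BAT pair."""
    blpi = field(upper, 0, 14)
    valid = field(lower, 25, 25)
    bsm = field(lower, 26, 31)
    la = field(ea, 0, 14)
    # Logical address bits under the ones of BSM are forced to zero
    # before the comparison with BLPI.
    return bool(valid) and (la & ~bsm & 0x7FFF) == blpi
```

For a 256-Kbyte block at logical address x'00400000' (BSM = 00 0001), addresses x'00400000' through x'0043FFFF' hit and x'00440000' does not.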

6.7.4 Block Memory Protection

If a logical address is determined to be within a block defined by the BTLB, the access is 
next validated by the memory protection mechanism. If this protection mechanism 
prohibits the access, a block protection violation exception condition (data access 
exception or instruction access exception) is generated. 

The block protection mechanism provides protection at the granularity defined by the block 
size (128 Kbyte to 8 Mbyte) and is described in Section 6.4, "General Memory Protection 
Mechanism." 

The Ks, Ku, and PP bits are located in the upper BAT register for block address translation. 

When the block protection mechanism prohibits a reference, one of the following occurs, 
depending on the type of access that was attempted: 

• For data accesses, a data access exception is generated and bit 4 of DSISR is set. If 
the access was a store, bit 6 of DSISR is additionally set. 

• For instruction accesses, an instruction access exception is generated and bit 4 of 
SRR1 is set.

6.7.5 Block Physical Address Generation 

If the block protection mechanism validates the access, a physical address is formed as 
shown in Figure 6-8. Bits in the logical address corresponding to ones in the BSM field, 
concatenated with the 17 lower-order bits of the logical address form the offset within the 
block of memory in the case of a hit. Bits in the logical address corresponding to zeros in 
the BSM field are then logically ORed with the corresponding bits in the PBN field to form 
the next higher-order bits of the physical address. Finally, the highest-order nine bits of the 
PBN field form bits 0-8 of the physical address (PA0-PA8).
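
One way to read the address formation is sketched below. It assumes that the logical address bits lying under the ones of BSM are ORed into the PBN field, which is consistent with the requirement that PBN contain zeros in those positions; the helper names are illustrative and the sketch is not a statement of the hardware implementation.

```python
def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 32-bit word (bit 0 is the MSB)."""
    return (word >> (31 - last)) & ((1 << (last - first + 1)) - 1)

def block_physical_address(lower: int, ea: int) -> int:
    """Form the physical address for a logical address that hit."""
    pbn = field(lower, 0, 14)
    bsm = field(lower, 26, 31)
    la = field(ea, 0, 14)
    # Bits selected by BSM come from the logical address; the
    # remaining high-order bits come from PBN.
    block_number = pbn | (la & bsm)
    # The 17 low-order logical address bits pass through unchanged.
    return (block_number << 17) | (ea & 0x1FFFF)
```

With a 256-Kbyte block whose PBN selects physical address x'01000000', logical address x'00421234' in a block based at x'00400000' maps to x'01021234'.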

Access to the physical memory within the block is made according to the memory/cache 
access mode defined by the WIM bits in the upper BAT register. These bits apply to the 
entire block rather than to an individual page and are described in Section 6.3,
"Memory/Cache Access Modes."

Figure 6-8. Block Physical Address Generation

6.7.6 Block Address Translation Summary 

Figure 6-9 provides the detailed flow for the block address translation mechanism. 
Figure 6-9 is an expansion of the "BTLB Hit" branch of Figure 6-3. Note that if execution of
the dcbz instruction is attempted with either W=1 or I=1, the alignment exception is
generated.

Figure 6-9. Block Address Translation Flow

Figure 6-10 further expands on the determination of a memory protection violation and the 
subsequent actions taken by the processor in this case. Note that in the case of a memory 
protection violation for the attempted execution of a dcbt or dcbtst instruction, the
translation is aborted and the instruction executes as a no-op (no violation is reported). 

6.8 Memory Segment Model 

Memory in the MPC601 is divided into sixteen 256-Mbyte segments. The segmented 
memory model of the MPC601 provides a way to map 4-Kbyte pages of logical addresses 
to 4-Kbyte pages in physical memory (page address translation), while providing the 
programming flexibility afforded by a large virtual address (52 bits). 

Figure 6-10. Memory Protection Violation Flow

The page address translation mechanism may be superseded by the block address 
translation (BAT) mechanism described in Section 6.7, "Block Address Translation." If 
not, the translation proceeds in two steps: from logical address to the 52-bit virtual address 
(which never exists as a specific entity but can be considered to be the concatenation of the 
virtual page number and the byte offset within a page), and from virtual address to physical 
address. 

This section first describes the implementation of the page address translation mechanism
in the MPC601 and then summarizes page address translation with a detailed flow diagram.

6.8.1 Page Address Translation Resources

The page address translation performed by the MPC601 is facilitated by the 16 segment 
registers, which provide virtual address and protection information, and by the UTLB, 
which maintains 256 recently-used page table entries (PTEs). The segment registers are 
programmed by the operating system to provide the virtual ID for a segment. In addition, 
the operating system also creates the page tables in memory that provide the logical to 
physical address mappings (in the form of PTEs) for the pages in memory. 

As shown in Figure 6-11, when an access occurs, one of the 16 segment registers is selected 
with LA0-LA3. For page address translation, the virtual ID field in the segment register is
then compared with the corresponding field of the two entries in the UTLB selected by
LA13-LA19 (one entry corresponding to set 0 and the other to set 1). In the case of a hit,
the result of this comparison is then used to select which physical page number (PPN) (from
set 0 or 1) to use for the access.
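
The lookup described above can be modeled in software as follows. The data layout (a list of 128 two-entry sets, each entry a dictionary) and the function name are assumptions made for illustration; the comparison of logical address bits LA4-LA12 against the stored page index is included as an assumption based on the UTLB entry contents described with the PTE format.

```python
def utlb_lookup(utlb, segment_registers, ea):
    """Return the PPN on a UTLB hit, or None on a miss.

    utlb: 128 sets, each a list of two entries (dict or None).
    segment_registers: 16 segment register values with T=0.
    """
    sr = segment_registers[(ea >> 28) & 0xF]   # selected by LA0-LA3
    vsid = sr & 0xFFFFFF                       # segment register bits 8-31
    page_index_hi = (ea >> 19) & 0x1FF         # LA4-LA12
    index = (ea >> 12) & 0x7F                  # LA13-LA19
    for entry in utlb[index]:                  # set 0, then set 1
        if (entry is not None
                and entry["vsid"] == vsid
                and entry["page_index_hi"] == page_index_hi):
            return entry["ppn"]
    return None
```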

Figure 6-11. Segment Register and UTLB Organization

In the case of a UTLB miss, the table search hardware in the MMU automatically searches 
for the required PTE in the page tables in memory. The MMU then automatically loads the 
UTLB with the PTE and the address translation is performed. Note that for an instruction
access, the required PTE is also loaded into the ITLB for future use. 

If the table search operations fail to locate the required PTE, then the appropriate exception 
(instruction access exception or data access exception) is taken. See Section 6.9.2, "Page 
Table Search Operation" for more information on the context for these exception 
conditions. 

6.8.2 Recognition of Addresses in Segments 

As described in Section 6.7.2, "Recognition of Addresses in BTLB," the block and page 
translation mechanisms operate in parallel such that if the logical address of an access hits 
in the BTLB (the address can be translated as a block address), the selected segment register 
is ignored, unless T=1 in the segment register.

Segments in the MPC601 are defined as one of the following two types: 

• Memory segment — A logical address in these segments represents a virtual address 
that is used to define the physical address of the page. 

• I/O controller interface segment — References made to I/O controller interface 
segments use the I/O controller interface bus protocol described in Section 9.6, "I/O 
Controller Interface Operation," and do not use the virtual paging mechanism of the 
MPC601. See Section 6.10, "I/O Controller Interface Address Translation," for a
complete description of the mapping of I/O controller interface segments. 

The T bit in a segment register selects between memory segments and I/O controller 
interface segments, as shown in Table 6-13. 

Table 6-13. Segment Register Types

Segment Register T Bit   Segment Type
0                        Memory segment
1                        I/O controller interface segment



The types of address translation used by the MPC601 MMU are shown in the flow diagram 
of Figure 6-4. 

6.8.2.1 Selection of Memory Segments

All accesses generated by the MPC601 index into the array of segment registers and select
one of the 16 with LA0-LA3. If MSR[IT]=0 or MSR[DT]=0 for an instruction or data
access, respectively, then direct address translation is performed as described in 
Section 6.6, "Direct Address Translation." Otherwise, if T=0 for the selected segment 
register, the access maps to memory space and page address translation is performed. 

After a memory segment is selected, the MPC601 creates the 52-bit virtual address for the 
segment and searches for the PTE (first in the UTLB, then in the page tables in memory) 
that dictates the physical page number to be used for the access. Note that I/O devices can
easily be mapped into memory space and used as memory-mapped I/O. 

6.8.2.2 Selection of I/O Controller Interface Segments 

All data accesses generated by the MPC601 index into the array of segment registers and 
select one of the 16 with LA0-LA3. If T=1 for the selected segment register, the access
maps to the I/O controller interface and the access proceeds as described in Section 6.10,
"I/O Controller Interface Address Translation." This is true, even if data address translation 
is disabled (MSR[DT]=0). 

For the case of instruction accesses, however, the MPC601 checks the state of the MSR[IT] 
bit before checking the T bit in the segment register. If MSR[IT] = 0, direct address 
translation is performed as described in Section 6.6, "Direct Address Translation." If 
MSR[IT] = 1 and the T bit of the selected segment register is set, then the MMU further 
checks the state of the BUID field of the segment register. If BUID has the encoding x'07F', 
the segment is designated as a memory-forced I/O controller interface segment and the 
instruction fetch occurs as described in Section 6.10.4, "Memory-Forced I/O Controller 
Interface Accesses." Otherwise, an instruction access exception occurs. 

6.8.3 Page Address Translation 

The first step in page address translation is the conversion of the 32-bit logical address of 
an access into the 52-bit virtual address. The virtual address is then used to locate the PTE 
either in the UTLB or in the page tables in memory. The physical page number is then 
extracted from the PTE and used in the formation of the physical address of the access. 

Figure 6-12 shows the translation of a logical address to a physical address as follows: 

• Bits 0-3 of the logical address comprise the segment register number used to select 
a segment register, from which the virtual segment ID (VSID) is extracted. 

• Bits 4-19 of the logical address correspond to the page number within the segment;
these are concatenated with the VSID from the segment register to form the virtual 
page number (VPN). The VPN is used to search for the PTE in either the UTLB or 
the page table. The PTE then provides the physical page number (PPN). 

• Bits 20-31 of the logical address are the byte offset within the page; these are 
concatenated with the PPN field of a PTE to form the physical address used to access 
memory. 
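
The three steps above amount to simple bit-field arithmetic, sketched below. The function names are illustrative assumptions, and the PTE search that supplies the PPN is not modeled.

```python
def split_logical_address(ea: int):
    """Split a 32-bit logical address into its three fields."""
    sr_number = (ea >> 28) & 0xF      # bits 0-3
    page_index = (ea >> 12) & 0xFFFF  # bits 4-19
    byte_offset = ea & 0xFFF          # bits 20-31
    return sr_number, page_index, byte_offset

def virtual_page_number(vsid: int, page_index: int) -> int:
    """Concatenate the 24-bit VSID with the 16-bit page index."""
    return ((vsid & 0xFFFFFF) << 16) | page_index

def physical_address(ppn: int, byte_offset: int) -> int:
    """Concatenate the 20-bit PPN with the 12-bit byte offset."""
    return ((ppn & 0xFFFFF) << 12) | byte_offset
```

The 40-bit VPN followed by the 12-bit byte offset gives the 52-bit virtual address.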

Figure 6-12. Page Address Translation Overview

6.8.3.1 Segment Register Definition 

The fields in the 16 segment registers are interpreted differently depending on the value of 
bit 0 (the T bit). When T=1, the segment register defines an I/O controller interface
segment, and the format is described in Section 6.10.1, "Segment Register Format for I/O 
Controller Interface." Figure 6-13 shows the format of a segment register used in page 
address translation (T=0). 

Figure 6-13. Segment Register Format for Page Address Translation

Table 6-14 provides the definitions of the segment register bits for page address translation. 

Table 6-14. Segment Register Bit Definition for Page Address Translation

Bit     Name   Description
0       T      T=0 selects this format
1       Ks     Supervisor-mode memory key
2       Ku     User-mode memory key
3-7     —      Reserved
8-31    VSID   Virtual segment ID



The Ks and Ku bits partially define the access protection for the pages within the segment. 
The page protection provided in the MPC601 is described in Section 6.8.5, "Page Memory 
Protection." The virtual segment ID field is used as the high-order bits of the virtual page 
number (VPN) as shown in Figure 6-12. 
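
The bit positions in Table 6-14 can be illustrated with a small decoder. This is a sketch only; the function name and the dictionary result are assumptions, and bit 0 is taken as the most-significant bit.

```python
def decode_segment_register(sr: int) -> dict:
    """Decode a 32-bit segment register in the T=0 format."""
    return {
        "T": (sr >> 31) & 1,     # bit 0
        "Ks": (sr >> 30) & 1,    # bit 1
        "Ku": (sr >> 29) & 1,    # bit 2
        "VSID": sr & 0xFFFFFF,   # bits 8-31
    }
```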

The segment registers are programmed with MPC601-specific instructions that implicitly
reference the segment registers. The MPC601 segment register instructions are 
summarized in Table 6-15. These instructions are privileged in that they are executable 
only while operating in supervisor mode. See Section 2.3.3.1, "Synchronization for 
Supervisor-Level SPRs, and Segment Registers" for information about the synchronization 
requirements when modifying the segment registers. See Chapter 10, "Instruction Set," for 
more detail on the encodings of these instructions. 

Table 6-15. Segment Register Instructions

Instruction      Description
mtsr SR#,rS      Move to Segment Register: SR[SR#] <- rS
mtsrin rS,rB     Move to Segment Register Indirect: SR[rB[0-3]] <- rS
mfsr rD,SR#      Move from Segment Register: rD <- SR[SR#]
mfsrin rD,rB     Move from Segment Register Indirect: rD <- SR[rB[0-3]]

These instructions are specific to the MPC601 and not
guaranteed on other PowerPC processors.

6.8.3.2 Page Table Entry (PTE) Format 

Page table entries (PTEs) are generated and placed in page tables in memory by the 
operating system using the hashing algorithm described in Section 6.9.1.3, "Hashing 
Functions." Each PTE is a 64-bit entity (two words) that maps one virtual page number 
(VPN) to one physical page number (PPN). Information in the PTE controls the table 
search process and provides input to the memory protection mechanism. Figure 6-14 shows 
the format of both words that comprise a PTE. 

Figure 6-14. Page Table Entry Format

Table 6-16 lists the bit definitions for each word in a PTE.

Table 6-16. PTE Bit Definitions

Word   Bit     Name   Description
0      0       V      Entry valid (V=1) or invalid (V=0)
0      1-24    VSID   Virtual segment ID
0      25      H      Hash function identifier
0      26-31   API    Abbreviated page index
1      0-19    PPN    Physical page number
1      20-22   —      Reserved
1      23      R      Reference bit
1      24      C      Change bit
1      25-27   WIM    Memory/cache control bits
1      28-29   —      Reserved
1      30-31   PP     Page protection bits



All other fields are reserved. 
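
The field positions in Table 6-16 can be illustrated with a small decoder for the two PTE words. This is a sketch only; the function name and the dictionary result are assumptions, and bit 0 is taken as the most-significant bit of each word.

```python
def decode_pte(word0: int, word1: int) -> dict:
    """Decode the two 32-bit words of a page table entry."""
    return {
        "V": (word0 >> 31) & 1,          # word 0, bit 0
        "VSID": (word0 >> 7) & 0xFFFFFF, # word 0, bits 1-24
        "H": (word0 >> 6) & 1,           # word 0, bit 25
        "API": word0 & 0x3F,             # word 0, bits 26-31
        "PPN": (word1 >> 12) & 0xFFFFF,  # word 1, bits 0-19
        "R": (word1 >> 8) & 1,           # word 1, bit 23
        "C": (word1 >> 7) & 1,           # word 1, bit 24
        "WIM": (word1 >> 4) & 0x7,       # word 1, bits 25-27
        "PP": word1 & 0x3,               # word 1, bits 30-31
    }
```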

The PTE contains an abbreviated page index rather than the complete page index field 
because at least ten of the low-order bits of the page index are used in the hash function to 
select a PTEG address (the location of a PTE). These bits are not repeated in the PTEs of 
that PTEG. However, when a PTE is loaded into the UTLB, the entire page index (PI) field 
must be loaded into the UTLB entry. The PI field is then compared with incoming logical 
address bits LA4-LA12 (LA13-LA19 select the UTLB entries to be compared) to
determine if there is a hit. 

The virtual segment ID field corresponds to the high-order bits of the virtual page number 
(VPN), and, along with the H bit, it is used to locate the PTE. The R and C bits maintain 
history information for the page as described in Section 6.8.4, "Page History Recording." 
The WIM bits define the memory/cache control mode for accesses to the page. Finally, the 
PP bits define the remaining access protection constraints for the page. The page protection 
provided in the MPC601 is described in Section 6.8.5, "Page Memory Protection." 

Conceptually, the page table is searched by the address translation hardware to translate the 
address of every reference. For performance reasons, the UTLB maintains recently-used 
PTEs so that the table search time is eliminated for most accesses. The UTLB is searched 
for the address translation first. If the PTE is found, then no page table search is performed. 
As a result, software that changes the page tables in any way must perform the appropriate 
TLB invalidate operations to keep the UTLB (and ITLB) coherent with respect to the page 
tables. 

6.8.4 Page History Recording 

Reference (R) and change (C) bits are automatically maintained by the MPC601 in the PTE 
for each physical page (for accesses made with page table address translation) to keep 
history information about the page. This information can then be used by the operating 
system to determine which areas of memory to write back to disk when new pages must be 
allocated in main memory. Reference and change recording is not performed for 
translations made with the BAT or for accesses that correspond to I/O controller interface 
(T=1) segments. Furthermore, R and C bits are maintained only for accesses made while 
address translation is enabled (MSR[IT]=1 or MSR[DT]=1). 

The reference and change bits are automatically updated by the MPC601 under the 
following circumstances: 

• For UTLB hits, if the C bit requires updating (as shown in Table 6-17). 

• For UTLB misses, when a table search is in progress to locate a PTE. The R and C 
bits are updated (set, if required) to reflect the status of the page based on this access. 

Table 6-17. Table Search Operations to Update History Bits: UTLB Hit Case

R and C bits
in UTLB entry  MPC601 Action
-------------  -----------------------------------------
00             Combination doesn't occur
01             Combination doesn't occur
10             Read: No special action
               Write: Table search operation to update C
11             No special action for read or write

Note that the processor updates the C bit based only on the status of the C bit in the UTLB 
entry in the case of a UTLB hit (the R bit is assumed to be set in the page tables if there is 
a UTLB hit). Therefore, when software clears the R and C bits in the page tables in memory, 
it must invalidate the UTLB entries associated with the pages whose reference and change 
bits were cleared. See Section 6.9.3, "Page Table Updates," for all of the constraints 
imposed on the software when updating the reference and change bits in the page tables. 
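
The UTLB hit case can be restated as a small decision function. This is only a sketch of the rule described above, not a model of the hardware; the function name utlb_hit_action and the returned strings are hypothetical.

```python
def utlb_hit_action(r_bit, c_bit, is_write):
    """Return the action taken on a UTLB hit, given the R and C bits of the entry."""
    if r_bit != 1:
        # R is always set in a UTLB entry, so R=0 combinations do not occur.
        raise ValueError("R=0 does not occur in a UTLB entry")
    if is_write and c_bit == 0:
        # Only a write to a page whose C bit is still clear needs a table search.
        return "table search to update C"
    return "no special action"
```

The function makes the software obligation visible: if the operating system clears C in memory but leaves a UTLB entry with C=1, a later write returns "no special action" and the change is never recorded in the page table.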

The R bit or the C bit for a page is not set by the execution of the Data Cache Block Touch 
instructions (dcbt or dcbtst). 

6.8.4.1 Reference Bit 

The reference bit of a page is located both in the PTE in the page table and in the copy of 
the PTE loaded into the UTLB. Every time a page is referenced (with a read or write access) 
the reference bit is set in the page table by the MPC601. Because the reference to a page is 
what causes a PTE to be loaded into the UTLB, the reference bit in all UTLB entries is 
always set. The processor never automatically clears the reference bit. 

The reference bit is only a hint to the operating system about the activity of a page. At times, 
the reference bit may be set although the access was not logically required by the program 
or even if the access was prevented by memory protection. Examples of this include the 
following: 

• Prefetching of instructions not subsequently executed 

• Accesses that cause exceptions and are not completed 

6.8.4.2 Change Bit 

The change bit of a page is also located both in the PTE in the page table and in the copy 
of the PTE loaded into the UTLB. Whenever a data store instruction is executed 
successfully, if the UTLB search (for page address translation) results in a hit, the change 
bit in the matching UTLB entry is checked. If it is already set, the processor does not 
change the C bit. If the UTLB change bit is 0, it is set and a table search operation is 
performed to set the C bit also in the corresponding PTE in the page table. 

The change bit (in both the UTLB and the PTE in the page tables) is set only when a store 
operation is allowed by the page memory protection mechanism. 

The automatic update of the reference and change bits in the MPC601 is performed with 
single-beat read and write transactions on the bus (not with atomic read/modify/write 
operations). 

During a table search operation, PTEs are fetched as global, nonexclusive read transactions 
(not as read-with-intent-to-modify transactions). This reduces cache thrashing in other 
processors (in a multiprocessor system) caused by UTLB load operations because other 
processors do not have to invalidate their resident copies of the PTEs. The response on the 
bus to a PTE load transaction should then be exclusive (SHD signal not asserted) if no other 
processor has a copy. Because PTEs are considered as cacheable, the MESI protocol of the 
cache then ensures that coherency is maintained among multiple processors for C bit 
updates to the page tables. 

6.8.5 Page Memory Protection 

Similar to the block memory protection mechanism, the page memory protection of the 
MPC601 provides selective access to each page in memory. If a logical address is 
determined to be within a page defined by the segment registers and an entry in the UTLB, 
the access is next validated by the page protection mechanism. If this protection mechanism 
prohibits the access, a page protection violation (data access exception or instruction access 
exception) is generated. 

When the page protection mechanism prohibits a reference, one of the following occurs, 
depending on the type of access that was attempted. 

• For data accesses, a data access exception is generated and bit 4 of DSISR is set. If 
the access was a store, bit 6 of DSISR is additionally set. 

• For instruction accesses, an instruction access exception is generated and bit 4 of 
SRR1 is set. 

See Chapter 5, "Exceptions," for more information on these types of exceptions. 

A store operation that is not permitted because of the page protection mechanism does not 
cause the change (C) bit to be set in the PTE (in either the UTLB or in the page tables in 
memory); however, a prohibited store access may cause a PTE to be loaded into the UTLB 
and consequently cause the reference bit to be set in a PTE (both in the UTLB and in the 
page table in memory). 
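
The status bits named above can be expressed as masks. The sketch below assumes the PowerPC convention that bit 0 is the most significant bit of a 32-bit register; the helper names ppc_bit and protection_violation_status are hypothetical.

```python
def ppc_bit(n):
    """Mask for bit n of a 32-bit register, where bit 0 is the most significant bit."""
    return 1 << (31 - n)


def protection_violation_status(is_instruction, is_store=False):
    """Return (exception name, register name, bits set) for a page protection violation."""
    if is_instruction:
        return ("instruction access exception", "SRR1", ppc_bit(4))
    bits = ppc_bit(4)
    if is_store:
        bits |= ppc_bit(6)  # stores additionally set bit 6 of DSISR
    return ("data access exception", "DSISR", bits)
```

An exception handler could test the same masks against the saved register value to distinguish a protection violation from other causes of the exception.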

6.8.6 Page Address Translation Summary 

Figure 6-15 provides the detailed flow for the page address translation mechanism. The 
figure is an expansion of the "UTLB Hit" branch of Figure 6-3. The detailed flow for the 
"UTLB Miss" branch of Figure 6-3 is described in Section 6.9.2, "Page Table Search 
Operation." Note that, as in the case of block address translation, if execution of the dcbz 
instruction is attempted with either W=1 or I=1, an alignment exception is generated. 
Also note that the memory protection violation flow for page address translation is identical 
to that of the block memory protection violation and is provided in Figure 6-10. 

6.9 Hashed Page Tables 

When an access that is to be translated by the page address translation mechanism results 
in a miss in the UTLB (a PTE corresponding to the VSID of the segment register is not 
resident in either of the UTLB entries indexed by LA13-LA19), the MPC601 automatically 
searches the page tables set up by the operating system in main memory. 

The algorithm used by the processor in searching the page tables includes a hashing 
function on some of the virtual address bits. Thus, the addresses for PTEs are allocated 
more evenly within the page tables and the hit rate of the page tables is maximized. This 
algorithm must be synthesized by the operating system for it to correctly place the page 
table entries in main memory. 

This section describes the format of the page tables and the algorithm used to access them. 
In addition, the constraints imposed on the software in updating the page tables (and other 
MMU resources) are described. 

Figure 6-15. Page Address Translation Flow: UTLB Hit 

6.9.1 Page Table Definition 

The hashed page table is a variable-sized data structure that defines the mapping between 
virtual page numbers and physical page numbers. The page table size is a power of 2, and 
its starting address is a multiple of its size. 

The page table contains a number of page table entry groups (PTEGs). A PTEG contains 
eight page table entries (PTEs) of eight bytes each; therefore each PTEG is 64 bytes long. 
PTEG addresses are entry points for table search operations. Figure 6-16 shows two PTEG 
addresses (PTEGaddr1 and PTEGaddr2) where a given PTE may reside. 

[Figure 6-16: a page table made up of PTEG0 through PTEGn, each PTEG containing PTE0 through PTE7]

Figure 6-16. Page Table Definitions 

A given PTE can reside in one of two possible PTEGs. For each PTEG address, there is a 
complementary PTEG address: one is the primary PTEG and the other is the secondary 
PTEG. Additionally, a given PTE can reside in any of the PTE locations within an 
addressed PTEG. Thus, a given PTE may reside in one of 16 possible locations within the 
page table. If a given PTE is not resident within either the primary or secondary PTEG, a 
page table miss occurs, corresponding to a page fault condition. 

A table search operation is defined as the search for a PTE within a primary and secondary 
PTEG. When a table search operation commences, a primary hashing function is performed 
on the virtual address. The output of the hashing function is then concatenated with bits 
(some of them masked) programmed into the SDR1 register by the operating system to 
create the physical address of the primary PTEG. The PTEs in the PTEG are then checked, 
one by one, to see if there is a hit within the PTEG. If the PTE is not located in 
this PTEG, a secondary hashing function is performed, a new physical address is generated 
for the PTEG, and the PTE is searched for again, this time using the secondary PTEG 
address. 

6.9.1.1 Table Search Description Register (SDR1) 

The SDR1 register contains the control information for the table structure in that it defines 
the highest order bits for the physical base address of the page table and it defines the size 
of the table. The format of the SDR1 register is shown in Figure 6-17 and the bit settings 
are described in Table 6-18. 



[Figure 6-17: SDR1 fields HTABORG (bits 0-15), reserved (bits 16-22), and HTABMASK (bits 23-31)]

Figure 6-17. SDR1 Register Format 

Table 6-18. SDR1 Register Bit Settings

Bits   Name      Description
-----  --------  ------------------------------
0-15   HTABORG   Physical address of page table
16-22  -         Reserved
23-31  HTABMASK  Mask for page table address



The HTABORG field in SDR1 contains the high-order 7-16 bits of the 32-bit physical 
address of the page table. Therefore, the beginning of the page table lies on a 2^16 byte (64 
Kbyte) boundary at a minimum. 

A page table can be any size 2^n where 16 ≤ n ≤ 25. The HTABMASK field in SDR1 
contains a mask value that determines how many bits from the output of the hashing 
function are used as the page table index. This mask must be of the form b'00...011...1' (a 
string of 0 bits followed by a string of 1 bits). As the table size increases, more bits are 
used from the output of the hashing function to index into the table. The 1 bits in 
HTABMASK determine how many additional bits (beyond the minimum of 10) from the 
hash are used as the index; the HTABORG field must have the same number of lower-order 
bits equal to 0 as the HTABMASK field has lower-order bits equal to 1. 
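
The constraints on HTABORG and HTABMASK can be checked mechanically. The sketch below decodes an SDR1 value and validates it against the rules above, assuming bit 0 is the most significant bit; the function name decode_sdr1 and the returned dictionary keys are hypothetical.

```python
def decode_sdr1(sdr1):
    """Decode SDR1 into page table base, PTEG count, and size in bytes."""
    htaborg = (sdr1 >> 16) & 0xFFFF   # bits 0-15
    htabmask = sdr1 & 0x1FF           # bits 23-31
    # A mask of the form 0...01...1 has no bits in common with mask + 1.
    if htabmask & (htabmask + 1):
        raise ValueError("HTABMASK must be a string of 0 bits followed by 1 bits")
    # HTABORG must be 0 in every low-order position where HTABMASK is 1.
    if htaborg & htabmask:
        raise ValueError("HTABORG low-order bits must be 0 where HTABMASK is 1")
    extra_bits = htabmask.bit_length()     # hash bits used beyond the minimum of 10
    num_ptegs = 1 << (10 + extra_bits)
    return {
        "base": htaborg << 16,
        "num_ptegs": num_ptegs,
        "table_bytes": num_ptegs * 64,     # each PTEG is 64 bytes
    }
```

For instance, an SDR1 value with HTABORG = x'0F98' and HTABMASK = b'0 0000 0111' decodes to a table of 8192 PTEGs based at x'0F98 0000'.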

6.9.1.2 Page Table Size 

The number of entries in the page table directly affects performance because it influences 
the hit ratio in the page table and thus the rate of page fault exception conditions. If the table 
is too small, not all virtual pages that have physical page frames assigned may be mapped 
via the page table. This can happen if there are more than 16 entries that map to the same 
primary/secondary pair of PTEGs; in this case, many hash collisions may occur. 

The minimum allowable size for a page table is 64 Kbytes (2^10 PTEGs of 64 bytes each). 
However, it is recommended that the total number of PTEGs (primary plus secondary) in 
the page table be greater than half the number of physical page frames to be mapped. While 
avoidance of hash collisions cannot be guaranteed for any size page table, making the page 
table larger than the recommended minimum size reduces the frequency of such collisions 
by making the primary PTEGs more sparsely populated, further reducing the need to 
use the secondary PTEGs. 

Table 6-19 shows some example sizes for total main memory. The recommended minimum 
page table sizes for these example memory sizes are then outlined, along with their 
corresponding HTABORG and HTABMASK settings. Note that systems with less than 
eight Mbytes of main memory may be designed with the MPC601, but the minimum 
amount of memory that can be used for the page tables is 64 Kbytes. 



Table 6-19. Recommended Page Table Sizes (Minimum)

                   Recommended Minimum                         Settings for Recommended Minimum
Total Main Memory  Memory for Page Tables  Mapped Pages  PTEGs  HTABORG      HTABMASK
                                           (PTEs)
-----------------  ----------------------  ------------  -----  -----------  -----------
8 Mbytes (2^23)    64 Kbytes (2^16)        2^13          2^10   x xxxx xxxx  0 0000 0000
16 Mbytes (2^24)   128 Kbytes (2^17)       2^14          2^11   x xxxx xxx0  0 0000 0001
32 Mbytes (2^25)   256 Kbytes (2^18)       2^15          2^12   x xxxx xx00  0 0000 0011
64 Mbytes (2^26)   512 Kbytes (2^19)       2^16          2^13   x xxxx x000  0 0000 0111
128 Mbytes (2^27)  1 Mbyte (2^20)          2^17          2^14   x xxxx 0000  0 0000 1111
256 Mbytes (2^28)  2 Mbytes (2^21)         2^18          2^15   x xxx0 0000  0 0001 1111
512 Mbytes (2^29)  4 Mbytes (2^22)         2^19          2^16   x xx00 0000  0 0011 1111
1 Gbyte (2^30)     8 Mbytes (2^23)         2^20          2^17   x x000 0000  0 0111 1111
2 Gbytes (2^31)    16 Mbytes (2^24)        2^21          2^18   x 0000 0000  0 1111 1111
4 Gbytes (2^32)    32 Mbytes (2^25)        2^22          2^19   0 0000 0000  1 1111 1111



As an example, if the physical memory size is 2^29 bytes (512 Mbytes), then there are 
2^29 / 2^12 (4 Kbyte page size) = 2^17 (128 K) total page frames. If this number of page frames 
is divided by 2, the resultant minimum recommended page table size is 2^16 PTEGs, or 2^22 
bytes (4 Mbytes) of memory for the page tables. 
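
The arithmetic in this example generalizes to other memory sizes. The following sketch computes the recommended minimum under the rule of one PTEG per two page frames, with the 64 Kbyte floor; rounding up to a power of 2 for memory sizes that are not powers of 2 is an assumption of this sketch, and the function name recommended_page_table is hypothetical.

```python
def recommended_page_table(mem_bytes, page_size=4096):
    """Return (num_ptegs, table_bytes, htabmask) for the recommended minimum table."""
    frames = mem_bytes // page_size
    # One PTEG per two page frames, but never fewer than 2^10 PTEGs (64 Kbytes).
    ptegs = max(frames // 2, 1 << 10)
    # Round up to a power of 2 (assumption: needed when memory is not a power of 2).
    ptegs = 1 << (ptegs - 1).bit_length()
    table_bytes = ptegs * 64
    # HTABMASK has one 1 bit for each hash bit used beyond the minimum of 10.
    htabmask = (ptegs >> 10) - 1
    return ptegs, table_bytes, htabmask
```

Calling the function with 512 Mbytes reproduces the figures in the example above: 2^16 PTEGs and 4 Mbytes of page table.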

6.9.1.3 Hashing Functions 

The processor uses two different hashing functions, a primary and a secondary, in the 
creation of the physical addresses used in a page table search operation. These hashing 
functions efficiently distribute the PTEs within the page table, in that there are two possible 
PTEGs where a given PTE can reside. Additionally, there are eight possible PTE locations 
within a PTEG where a given PTE can reside. If a PTE is not found using the primary 
hashing function, the secondary hashing function is performed, and the secondary PTEG is 
searched. Note that these two functions must also be used by the operating system to 
appropriately set up the page tables in memory. 

The use of the two hashing functions provides a high probability that a required PTE is 
resident in the page tables, without requiring the definition of all possible PTEs in main 
memory. However, if a PTE is not found in the secondary PTEG, then a page fault occurs 
and an exception is taken. Thus, the required PTE can then be placed into either the primary 
or secondary PTEG by the system software, and on the next UTLB miss to this page, the 
PTE will be found. 

The address of a page table is derived from the HTABORG field of the SDR1 register, and 
the output of the corresponding hashing function (primary hashing function for primary 
PTEG and secondary hashing function for a secondary PTEG). The value in HTABMASK 
determines how many of the higher-order hash value bits are masked and how many are 
used in the generation of the physical address of the page table. 

Figure 6-18 depicts the hashing functions used by the MPC601. The inputs to the primary 
hashing function are the lower-order 19 bits of the VSID field of the selected segment 
register (bits 5-23 of the 52-bit virtual address), and the page index field of the logical 
address (bits 24-39 of the virtual address) concatenated with three zero higher-order bits. 
The XOR of these two values generates the output of the primary hashing function (hash 
value 1). 

[Figure 6-18: primary hash: the low-order 19 bits of the VSID (virtual address bits 5-23) XOR the page index (virtual address bits 24-39, extended with three high-order zero bits) gives hash value 1 (bits 0-18); secondary hash: the one's complement of hash value 1 gives hash value 2 (bits 0-18)]

Figure 6-18. Hashing Functions 

When the secondary hashing function is required, the output of the primary hashing 
function is complemented with one's complement arithmetic, to provide hash value 2. 
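
The two hashing functions are small enough to state directly in code. This sketch follows the description above, taking the 24-bit VSID and the 16-bit page index as integers; the function names are hypothetical.

```python
HASH_BITS = 19
HASH_MASK = (1 << HASH_BITS) - 1  # 0x7FFFF


def primary_hash(vsid, page_index):
    """Low-order 19 bits of the VSID XOR the 16-bit page index (zero-extended)."""
    return (vsid & HASH_MASK) ^ (page_index & 0xFFFF)


def secondary_hash(hash_value_1):
    """One's complement of hash value 1, limited to 19 bits."""
    return ~hash_value_1 & HASH_MASK
```

Because the secondary hash is the complement of the primary hash, applying secondary_hash twice returns the original value, which is why a PTEG that is primary for one address can be secondary for another.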

6.9.1.4 Page Table Addresses 

Figure 6-19 illustrates the generation of the addresses used for accessing the hashed page 
tables defined for the MPC601. As stated earlier, the operating system must synthesize the 
table search algorithm for setting up the tables. 

As shown in Figure 6-19, two of the elements that define the 52-bit virtual address (the 
segment register VSID field and the page index field of the logical address) are used as 
inputs into a hashing function. Depending on whether the primary or secondary PTEG is to 
be accessed, the processor uses either the primary or secondary hashing function. 

[Figure 6-19: the 24-bit VSID and the 16-bit page index (6-bit API plus 10 low-order bits) of the 52-bit virtual address feed the hash function, producing a 19-bit hash value. The high-order 9 bits of the hash value are ANDed with the 9-bit HTABMASK and ORed with the low-order 9 bits of HTABORG to form bits 7-15 of the PTEG address; the high-order 7 bits of HTABORG form bits 0-6; the low-order 10 bits of the hash value form bits 16-25; bits 26-31 select a PTE (8 bytes) within the 64-byte PTEG. The 20-bit PPN from the matching PTE is concatenated with the 12-bit byte offset to form the 32-bit physical address.]

Figure 6-19. Generation of Addresses for Page Tables 

The base address of the page table is defined by the higher-order bits of SDR1[HTABORG]. 
Bits 7-15 of the PTEG address are derived from the masking of the higher-order bits of the 
hash value (as defined by SDR1[HTABMASK]) concatenated with (implemented as an OR 
function) the remaining bits of SDR1[HTABORG]. Bits 16-25 of the PTEG address are the 
10 lower-order bits of the hash value, and bits 26-31 of the PTEG address are zero. In the 
process of searching for a PTE, the processor first checks PTE0 (at the PTEG base address). 
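
The bit assembly described above can be sketched as follows. The code assumes a 32-bit SDR1 value and a 19-bit hash value supplied as integers, with bit 0 as the most significant bit; the function name pteg_address is hypothetical.

```python
def pteg_address(sdr1, hash_value):
    """Form the 32-bit physical address of a PTEG from SDR1 and a 19-bit hash value."""
    htaborg = (sdr1 >> 16) & 0xFFFF        # SDR1 bits 0-15
    htabmask = sdr1 & 0x1FF                # SDR1 bits 23-31
    hash_hi = (hash_value >> 10) & 0x1FF   # high-order 9 bits of the hash value
    hash_lo = hash_value & 0x3FF           # low-order 10 bits of the hash value
    addr_0_6 = htaborg >> 9                             # high-order 7 bits of HTABORG
    addr_7_15 = (htaborg & 0x1FF) | (hash_hi & htabmask)  # mask, then OR
    # Bits 26-31 are zero, so the address points at PTE0 of the PTEG.
    return (addr_0_6 << 25) | (addr_7_15 << 16) | (hash_lo << 6)
```

The same function serves both the primary and the secondary PTEG; only the hash value passed in differs.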

6.9.1.5 Page Table Structure 

In the process of searching for a PTE, the processor interprets the values read from memory 
as described in Section 6.8.3.2, "Page Table Entry (PTE) Format." The VSID and the 
abbreviated page index (API) fields of the 52-bit virtual address of the access are compared 
to those same fields of the PTEs in memory. In addition, the valid (V) bit and the hashing 
function (H) bit are also checked. For a hit to occur, the V bit of the PTE in memory must 
be set. If the fields match and the entry is valid, the PTE is considered a hit if the H bit is 
set as follows: 

• If this is the primary PTEG, H=0 

• If this is the secondary PTEG, H=1 

The physical address of the PTEs to be checked is derived as shown in Figure 6-19, and is 
the address of a group of eight PTEs (a PTEG). During a table search operation, the 
processor first compares the PTE0 location defined by the primary hashing function. If the 
VSID and API fields do not match (or if V or H are not set appropriately), the processor 
increments the lower-order address bits by eight bytes and checks the PTE1 location and so 
on, until all eight PTEs in the PTEG have been checked. 

If no match is found, the secondary hashing function is performed, and the secondary 
PTEG address is derived. The eight PTEs within the secondary PTEG are then similarly 
checked. If the required PTE is not found in any of the 16 possible locations (the eight PTEs 
within the primary PTEG and the eight PTEs within the secondary PTEG), then a page fault 
occurs and an exception is taken. Thus, if a valid PTE is located in the page tables, the page 
is considered resident; if no matching (and valid) PTE is found for an access, the page is 
interpreted as non-resident (page fault) and the operating system must load the PTE (and 
possibly the page) into main memory. 

Note that for performance reasons, PTEs should be allocated by the operating system first 
beginning with the PTE0 locations within the primary PTEG, then PTE1, and so on. If more 
than eight PTEs are required within the address space that defines a PTEG address, the 
secondary PTEG can be used. Nonetheless, it may be desirable to place the PTEs that will 
require most frequent access at the beginning of a PTEG and reserve the PTEs in the 
secondary PTEG for the least frequently accessed PTEs. 

6.9.1.5.1 Page Table Structure Example 

Figure 6-20 shows the structure of an example page table. The base address of this page 
table is defined by bits 0-13 in SDR1[HTABORG]; note that bits 14 and 15 of HTABORG 
must be zero because the lower-order two bits of HTABMASK are ones. The addresses for 
individual PTEGs within this page table are then defined by bits 14-25 as an offset from 
bits 0-13 of this base address. Thus the size of the page table is defined as 4096 PTEGs. 



Example:

Given: SDR1 = 1010 0110 0000 0000 0000 0000 0000 0011
       (HTABORG = bits 0-15, HTABMASK = bits 23-31)

Base address of page table: $A600 0000
Page table: PTEG0 through PTEG4095, each containing PTE0 through PTE7

            0                14             25      31
PTEGaddr1 = 1010 0110 0000 00aa aaaa aaaa aa00 0000
PTEGaddr2 = 1010 0110 0000 00bb bbbb bbbb bb00 0000

Figure 6-20. Example Page Table Structure 

Two example PTEG addresses are shown in the figure as PTEGaddr1 and PTEGaddr2. Bits 
14-25 of each PTEG address in this example page table are derived from the output of the 
hashing function (bits 26-31 are zero to start with PTE0 of the PTEG). In this example, the 
'b' bits in PTEGaddr2 are the one's complement of the 'a' bits in PTEGaddr1. If bits 14-25 
of PTEGaddr1 were derived by using the primary hashing function, then PTEGaddr2 
corresponds to the secondary PTEG. 

Note, however, that the 'b' bits in PTEGaddr2 can also be derived from a combination of 
logical address bits, segment register bits, and the primary hashing function. In this case, 
PTEGaddr1 corresponds to the secondary PTEG. Thus, while a PTEG may be 
considered a primary PTEG for some logical addresses (and segment register bits), it may 
also correspond to the secondary PTEG for a different logical address (and segment register 
value). 

It is the value of the H bit in each of the individual PTEs that identifies a particular PTE as 
either primary or secondary (there may be PTEs that correspond to a primary PTEG and 
PTEs that correspond to a secondary PTEG, all within the same physical PTEG address 
space). Thus, only the PTEs that have H=0 are checked for a hit during a primary PTEG 
search. Likewise, only PTEs with H=1 are checked in the case of a secondary PTEG search. 

6.9.1.5.2 PTEG Address Mapping Example 

Figure 6-21 shows an example of a logical address and how its address translation (the 
PTE) maps into the primary PTEG in physical memory. The example illustrates how the 
processor generates PTEG addresses for a table search operation; this is also the algorithm 
that must be used by the operating system in creating the page tables. 

In the example, the value in SDR1 defines a page table at address x'0F98 0000' that 
contains 8192 PTEGs. The example logical address selects segment register 0 (SR0) with 
the highest order four bits. The contents of SR0 are then used along with bits 4-19 of the 
logical address to create the 52-bit virtual address. 

To generate the address of the primary PTEG, bits 5-23 and bits 24-39 of the virtual 
address are then used as inputs into the primary hashing function (XOR) to generate hash 
value 1. The lower-order 13 bits of hash value 1 are then concatenated with the higher-order 
13 bits of HTABORG, defining the address of the primary PTEG (x'0F9F F980'). 

Example:

Given: SDR1 = 0000 1111 1001 1000 0000 0000 0000 0111
       (HTABORG = bits 0-15, HTABMASK = bits 23-31)

LA = x'00FFA01B': 0000 | 0000 1111 1111 1010 | 0000 0001 1011
     (bits 0-3: segment register select = x'0'; bits 4-19: page index; bits 20-31: byte offset)

SR0 = 0010 0000 1100 1010 0111 0000 0001 1100

Virtual address (VSID | page index | byte offset):
     1100 1010 0111 0000 0001 1100 | 0000 1111 1111 1010 | 0000 0001 1011

Primary hash:
     Virtual address bits 5-23:           010 0111 0000 0001 1100
     XOR
     000 + virtual address bits 24-39:    000 0000 1111 1111 1010
     =
     Hash value 1:                        010 0111 1111 1110 0110
     (high-order 9 bits | low-order 10 bits: 010 0111 11 | 11 1110 0110)

Primary PTEG address (HTABORG bits 0-12, then hash value 1 in bits 13-25, then zeros to start at PTE0):
     0000 1111 1001 1111 1111 1001 1000 0000 = x'0F9F F980'

Figure 6-21. Example Primary PTEG Address Generation 

Figure 6-22 shows the generation of the secondary PTEG address for this example. If the 
secondary PTEG is required, the secondary hash function is performed and the lower order 
13 bits of hash value 2 are then concatenated with the higher order 13 bits of HTABORG, 
defining the address of the secondary PTEG (x'0F98 0640'). 

As described in Figure 6-19, the 10 lower-order bits of the page index field are always used 
in the generation of a PTEG address (through the hashing function). This is why only the 
abbreviated page index (API) is defined for a PTE (the entire page index field does not need 
to be checked). For a given logical address, the lower order 10 bits of the page index (at 
least) contribute to the PTEG address (both primary and secondary) where the 
corresponding PTE may reside in memory. Therefore, if the higher order 6 bits (the API 
field) of the page index match with the API field of a PTE within the specified PTEG, the 
PTE mapping is guaranteed to be the unique PTE required. 

Primary PTEG address:
     0000 1111 1001 1111 1111 1001 1000 0000 = x'0F9F F980'

Hash value 2 (one's complement of hash value 1):
     101 1000 0000 0001 1001
     (high-order 9 bits | low-order 10 bits: 101 1000 00 | 00 0001 1001)

Secondary PTEG address (HTABORG bits 0-12, then hash value 2 in bits 13-25, then zeros to start at PTE0):
     0000 1111 1001 1000 0000 0110 0100 0000 = x'0F98 0640'

1) First compare 8 PTEs at x'0F9F F980'
2) Then, compare 8 PTEs at x'0F98 0640', if necessary

Page table (base address x'0F98 0000'):
     PTEG0
     PTEG25    at x'0F98 0640' (PTE0 through PTE7)
     PTEG8166  at x'0F9F F980' (PTE0 through PTE7)
     PTEG8191

Figure 6-22. Example Secondary PTEG Address Generation 

Note that a given PTEG address does not map back to a unique logical address. Not only 
can a given PTEG be considered both a primary and a secondary PTEG (as described in 
Section 6.9.1.5.1, "Page Table Structure Example"), but in this example, bits 24-26 of the 
page index field of the virtual address are not used to generate the PTEG address. 
Therefore, any of the eight combinations of these bits will map to the same primary PTEG 
address. (However, these bits are part of the API and are therefore compared for each PTE 
within the PTEG to determine if there is a hit.) Furthermore, a logical address can select a 
different segment register with a different value such that the output of the primary (or 
secondary) hashing function happens to equal the hash values shown in the example. Thus 
these logical addresses would also map to the same PTEG addresses shown. 
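
The many-to-one property described above can be checked numerically with the values of the example. The sketch below recomputes the primary PTEG address while varying logical address bits 4-6 (virtual address bits 24-26); the function name primary_pteg_address is hypothetical, and the segment register contents are passed in directly rather than selected by the address.

```python
def primary_pteg_address(la, sr_value, sdr1):
    """Primary PTEG address for logical address la, given segment register contents."""
    vsid = sr_value & 0xFFFFFF                 # low-order 24 bits of the segment register
    page_index = (la >> 12) & 0xFFFF           # LA bits 4-19
    hash1 = (vsid ^ page_index) & 0x7FFFF      # primary hash, 19 bits
    htaborg = (sdr1 >> 16) & 0xFFFF
    htabmask = sdr1 & 0x1FF
    mid9 = (htaborg & 0x1FF) | ((hash1 >> 10) & htabmask)
    return ((htaborg >> 9) << 25) | (mid9 << 16) | ((hash1 & 0x3FF) << 6)


SDR1 = 0x0F980007   # example SDR1 value
SR0 = 0x20CA701C    # example SR0 contents
LA = 0x00FFA01B     # example logical address

# Vary LA bits 4-6; these bits fall in the part of the hash masked off by HTABMASK.
aliases = {primary_pteg_address(LA | (k << 25), SR0, SDR1) for k in range(8)}
```

With these inputs the set aliases holds a single address, x'0F9F F980', so all eight logical addresses share one primary PTEG and are told apart only by the API comparison.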

6.9.2 Page Table Search Operation 

An outline of the table search process performed by the MPC601 in the search of a PTE is 
as follows: 

1. The 32-bit physical address of the primary PTEG is generated as described in 
Section 6.9.1.4, "Page Table Addresses." 

2. The first PTE (PTE0) in the primary PTEG is read from memory. 

3. The PTE in the selected PTEG is tested for a match with the virtual page number 
(VPN) of the access. The VPN is the VSID concatenated with the page index field 
of the virtual address. For a match to occur, the following must be true: 

— PTE[H] = 0 

— PTE[V] = 1 

— PTE[VSID] = VA[0-23] 

— PTE[API] = VA[24-29] 

4. If a match is not found, step 3 is repeated for each of the other seven PTEs in the 
primary PTEG. If a match is found, go to step 8. If a match is not found within the 
8 PTEs of the primary PTEG, the address of the secondary PTEG is generated. 

5. The first PTE (PTE0) in the secondary PTEG is read from memory. 

6. The PTE in the selected secondary PTEG is tested for a match with the virtual page 
number (VPN) of the access. For a match to occur, the following must be true: 

— PTE[H] = 1 

— PTE[V] = 1 

— PTE[VSID] = VA[0-23] 

— PTE[API] = VA[24-29] 

7. If a match is not found, step 6 is repeated for each of the other seven PTEs in the 
secondary PTEG. 

8. If a match is found, the PTE is written into the UTLB and the R bit is updated in the 
PTE in memory (if necessary). If there is no memory protection violation, the C bit 
is also updated in memory and the table search is complete. 

9. If a match is not found within the 8 PTEs of the secondary PTEG, the search fails, 
and a page fault exception condition occurs (either an instruction access exception 
or a data access exception). 
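
The search outlined in steps 1 through 9 can be sketched as a software routine, which is also roughly what an operating system needs when it looks up its own page tables. The sketch assumes memory is reached through a caller-supplied read_word(address) function returning a 32-bit integer; that function and the name table_search are hypothetical. The R and C bit updates of step 8 and the protection check are not modeled.

```python
def table_search(read_word, primary_pteg, secondary_pteg, vsid, api):
    """Return (pte_address, pte_word1) for the matching PTE, or None on a page fault."""
    # Primary PTEG is searched with H=0, then the secondary PTEG with H=1.
    for h, base in ((0, primary_pteg), (1, secondary_pteg)):
        for i in range(8):                      # PTE0 through PTE7, 8 bytes apart
            addr = base + 8 * i
            word0 = read_word(addr)
            valid = (word0 >> 31) & 1           # V
            pte_vsid = (word0 >> 7) & 0xFFFFFF  # VSID
            pte_h = (word0 >> 6) & 1            # H
            pte_api = word0 & 0x3F              # API
            if valid == 1 and pte_h == h and pte_vsid == vsid and pte_api == api:
                return addr, read_word(addr + 4)
    return None  # no match in either PTEG: page fault condition
```

A caller would compute the two PTEG addresses from SDR1 and the hash values first, then pass in the VSID and API of the access. Note that a PTE with H=0 stored in the secondary PTEG is skipped, as the match conditions require.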

Reads from memory for table search operations are performed as global (but not exclusive), 
cacheable operations, and are loaded into the on-chip cache of the MPC601. 

Figure 6-23 and Figure 6-24 provide detailed flow diagrams of the table search operations 
performed by the MPC601. Figure 6-23 shows the case of a dcbz instruction that is 
executed with W=1 or I=1, and shows that the R bit is updated in memory (if required) before the 
alignment exception occurs. The R bit is also updated (if required) in the case of a memory 
protection violation except for the case of a dcbt or a dcbtst instruction. If either of these 
instructions is executed and a protection violation occurs, the translation is simply aborted, 
the R bit is not set in memory, and the instruction execution becomes a no-op (not shown in 
the figure). 

Figure 6-23. Primary Table Search Flow 
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r Secondary Table Search j 



Generate PA using Secondary Hash Function 

PAf-BasePAof PTEG 

(PA26-PA31 =0) 
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(Fetch next PTE in PTEG) 



Fetch PTE from PTEG 
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otherwise 
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(Last PTE in PTEG) 



Instruction Access 



Page Fault 
Data Access 



(Secondary Table A 
Search Hit y 
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C Instruction Access A f Data Access A 
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Figure 6-24. Secondary Table Search Flow 

6.9.3 Page Table Updates 

This section describes the requirements on the software when updating page tables in 
memory via some pseudo-code examples. In a multiprocessor system the rules described 
in this section must be followed so that all processors operate with a consistent set of page 
tables. Even in a single processor system, certain rules must be followed, regarding 
reference and change bit updates, because software changes must be synchronized with 
automatic updates made by the hardware. Updates to the tables include the following 
operations: 

• Adding a PTE 

• Modifying a PTE, including modifying the R and C bits of a PTE 

• Deleting a PTE 

PTEs must be 'locked' on multiprocessor systems. Access to PTEs must be appropriately 
synchronized by software locking of (that is, guaranteeing exclusive access to) PTEs or 
PTEGs if more than one processor can modify the table at that time. In the examples below, 
"lock()" and "unlock()" refer to software locks that must be performed to provide exclusive 
status for the PTE being updated. See Appendix G, "Synchronization Programming 
Examples," for more information about the use of the lwarx and stwcx. instructions to 
perform software interlocks. 

On single processor systems, PTEs need not be locked. To adapt the examples given below 
for the single processor case, simply delete the "lock()" and "unlock()" lines from the 
examples. The sync instructions shown are required even for single processor systems. 

The UTLB (and ITLB) are non-coherent caches of the page tables. UTLB entries must be 
flushed explicitly with the TLB invalidate entry instruction (tlbie) whenever the 
corresponding PTE is modified. In a multiprocessor system, the tlbie instruction must be 
controlled by software locking, so that the tlbie is issued on only one processor at a time. 
The sync instruction causes the processor to wait until the TLB invalidate operation in 
progress by this processor is complete. 

The PowerPC architecture defines the tlbsync instruction (an illegal instruction in the 
MPC601) that ensures that TLB invalidate operations executed by this processor have 
caused all appropriate actions in other processors on the system bus. In a system that 
contains both MPC601 processors and other PowerPC processors, the tlbsync functionality 
must be emulated for the MPC601 in order to ensure proper synchronization with the other 
PowerPC processors. 

Any processor, including the processor modifying the page table, may access the page table 
at any time in an attempt to reload a UTLB entry. An inconsistent page table entry must 
never accidentally become visible; thus there must be synchronization between 
modifications to the valid bit and any other modifications. This requires as many as two 
sync operations for each PTE update. 

The MPC601 writes reference and change bits with unsynchronized, atomic byte store 
operations. Note that the V, R, and C bits each reside in a distinct byte of a PTE. Therefore, 
extreme care must be taken to ensure that no store operation inadvertently overwrites one 
of these bytes. 

6.9.3.1 Adding a Page Table Entry 

Adding a page table entry requires only a lock on the PTE in a multiprocessor system. The 
bytes in the PTE are then written, except for the valid bit. A sync instruction then ensures 
that the updates have been made to memory, and lastly, the valid bit is set. 

lock(PTE) 
PTE[VSID,H,API] <- new values 
PTE[PPN,R,C,WIM,PP] <- new values 
sync 
PTE[V] <- 1 
unlock(PTE) 

6.9.3.2 Modifying a Page Table Entry 

This section describes several scenarios for modifying a PTE. 

6.9.3.2.1 General Case 

In the general case, a currently-valid PTE must be changed. To do this, the PTE must be 
locked, marked invalid, flushed from the TLB, updated, marked valid again, and unlocked. 
The sync instruction must be used at appropriate times to wait for modifications to 
complete. 

Note that the tlbsync and the sync instruction that follows it are only required if 
compatibility must be maintained with other PowerPC processors that implement the 
tlbsync instruction. The tlbsync instruction is not implemented in the MPC601 but can be 
emulated in the illegal instruction exception handler. 

lock(PTE) 
PTE[V] <- 0 
sync 
tlbie(PTE) 
sync 
tlbsync 
sync 
PTE[VSID,H,API] <- new values 
PTE[PPN,R,C,WIM,PP] <- new values 
sync 
PTE[V] <- 1 
unlock(PTE) 
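
The ordering in the general case can be sketched in C as shown below. This is only an illustration under stated assumptions: `sync_barrier` uses a C11 fence as a portable stand-in for the sync instruction, `tlbie_stub` is a hypothetical placeholder for the privileged tlbie instruction, the lock()/unlock() calls and the optional tlbsync are left as comments, and the V bit is assumed to be the most significant bit of the first PTE word. None of these names come from the manual.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_V 0x80000000u /* assumed position of the valid bit in word 0 */

typedef struct {
    volatile uint32_t word0; /* V, VSID, H, API */
    volatile uint32_t word1; /* PPN, R, C, WIM, PP */
} pte_t;

/* Stand-in for the sync instruction (assumption: a full fence on the host). */
static void sync_barrier(void) { atomic_thread_fence(memory_order_seq_cst); }

/* Stand-in for tlbie; it only counts calls so the sketch runs anywhere. */
static unsigned tlbie_calls;
static void tlbie_stub(const pte_t *pte) { (void)pte; tlbie_calls++; }

void pte_modify(pte_t *pte, uint32_t new_word0, uint32_t new_word1)
{
    assert(pte != NULL);
    /* lock(PTE) would go here on a multiprocessor system */
    pte->word0 &= ~PTE_V;            /* PTE[V] <- 0 */
    sync_barrier();
    tlbie_stub(pte);                 /* flush the stale translation */
    sync_barrier();
    /* tlbsync; sync would go here only for cross-processor compatibility */
    pte->word0 = new_word0 & ~PTE_V; /* new VSID, H, API with V still 0 */
    pte->word1 = new_word1;          /* new PPN, R, C, WIM, PP */
    sync_barrier();
    pte->word0 = new_word0 | PTE_V;  /* PTE[V] <- 1 */
    /* unlock(PTE) would go here */
}
```

The entry is never visible with V=1 while its other fields are partially written, which is the property the two-step valid-bit handling is meant to provide.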

6.9.3.2.2 Clearing the Reference (R) Bit 

When the PTE is modified only to clear the R bit to 0, a much simpler algorithm suffices 
because the R bit need not be maintained exactly. 

lock(PTE) 
oldR <- PTE[R] 
PTE[R] <- 0 
if oldR = 1, then tlbie(PTE) 
unlock(PTE) 

Since only the R and C bits are modified by the processor, and since they reside in different 
bytes, the R bit can be cleared by reading the current contents of the byte in the PTE 
containing R (bits 16-23 of the second word), ANDing the value with x'FE', and storing 
the byte back into the PTE. 
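
The byte-wide update described above can be sketched in C. The sketch treats the PTE as its eight-byte big-endian memory image, so the byte holding bits 16-23 of the second word is at offset 6; the names `pte_clear_r`, `PTE_R_BYTE`, and `PTE_R_MASK` are hypothetical. The return value tells the caller whether a tlbie is needed, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_R_BYTE 6     /* byte offset of bits 16-23 of the second word */
#define PTE_R_MASK 0x01u /* R is the low-order bit of that byte */

/* Clear R with a single byte store so the bytes holding V and C are untouched.
 * Returns true if R was set, in which case the caller should issue tlbie. */
bool pte_clear_r(volatile uint8_t pte[8])
{
    assert(pte != NULL);
    uint8_t old = pte[PTE_R_BYTE];
    pte[PTE_R_BYTE] = (uint8_t)(old & 0xFEu);
    return (old & PTE_R_MASK) != 0;
}
```

Only byte 6 is written, which is what keeps the store from colliding with hardware updates of the C bit in the neighboring byte.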

6.9.3.2.3 Modifying the Virtual Address 

If the virtual address is being changed to a different address within the same hash class 
(primary or secondary), the following flow suffices: 

lock(PTE) 
val <- PTE[VSID,API,H,V] 
val <- new VSID 
PTE[VSID,API,H,V] <- val 
sync 
tlbie(PTE) 
sync 
tlbsync 
sync 
unlock(PTE) 

In this pseudo-code flow, note that the store into the first word of the PTE is performed 
atomically. Also, the tlbsync and the sync instruction that follows it are only required if 
compatibility must be maintained with other PowerPC processors that implement the 
tlbsync instruction. The tlbsync instruction is not implemented in the MPC601 but can be 
emulated in the illegal instruction exception handler. 

6.9.3.3 Deleting a Page Table Entry 

In this example, the entry is locked, marked invalid, invalidated in the TLBs, and unlocked. 

Again, note that the tlbsync and the sync instruction that follows it are only required if 
compatibility must be maintained with other PowerPC processors that implement the 
tlbsync instruction. The tlbsync instruction is not implemented in the MPC601 but can be 
emulated in the illegal instruction exception handler. 

lock(PTE) 
PTE[V] <- 0 
sync 
tlbie(PTE) 
sync 
tlbsync 
sync 
unlock(PTE) 

6.9.4 Segment Register Updates 

There are certain synchronization requirements for using the move to segment register 
instructions. These are described in Section 2.3.3.1, "Synchronization for Supervisor-Level 
SPRs, and Segment Registers." 

6.10 I/O Controller Interface Address Translation 

An I/O controller interface segment is a mapping of logical addresses to the I/O controller 
interface bus protocol. I/O controller interface segments are provided for POWER 
compatibility. Applications that require low-latency load/store access to external address 
space should use memory-mapped I/O, rather than the I/O controller interface. 

A logical address within the I/O controller interface space corresponds to a segment register 
which has T=1. For more details about memory references to I/O controller interface 
segments, refer to Chapter 9, "System Interface Operation." 

As a subset of I/O controller interface address translation, the MPC601 also provides a way 
to force I/O controller interface accesses to be made to memory. This memory-forced I/O 
controller interface capability allows a 256-Mbyte segment of memory to be mapped with 
only one segment register and no page translation overhead. Note that this functionality 
may not be provided in other PowerPC processors. 

6.10.1 Segment Register Format for I/O Controller Interface 

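Figure 6-25 shows the register format for the segment registers when the T bit is set. 

 0   1    2    3          11 12                   31 
+---+----+----+-------------+----------------------+ 
| T | Ks | Ku |    BUID     | Controller Specific  | 
+---+----+----+-------------+----------------------+ 

Figure 6-25. Segment Register Format for I/O Controller Interface 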

Table 6-20 shows the bit definitions for the segment registers when the T bit is set. 

Table 6-20. Segment Register Bit Definitions for I/O Controller Interface 

Bit      Name                  Description 
0        T                     T=1 selects this format 
1        Ks                    Supervisor mode memory key 
2        Ku                    User mode memory key 
3-11     BUID                  Bus unit ID 
12-31    Controller specific   Device-dependent data for I/O controller 

6.10.2 I/O Controller Interface Accesses 

When the address translation process determines that the segment has T=1, I/O controller 
interface address translation is selected and any match due to block address translation (see 
Section 6.7, "Block Address Translation") is ignored. Additionally, no reference is made 
to the page tables. The following data is sent to the memory controller in the protocol (two 
packets consisting of address-only cycles) described in Section 9.6, "I/O Controller 
Interface Operation": 

• Packet 0 

   - One of the Kx bits (Ks or Ku) is selected to be the key as follows: 

      - For supervisor accesses (MSR[PR]=0), the Ks bit is used and Ku is ignored 

      - For user accesses (MSR[PR]=1), the Ku bit is used and Ks is ignored 

   - The contents of bits 3-31 of the segment register, which is the BUID field 
     concatenated with the "controller-specific" field. 

• Packet 1: SR[28-31] concatenated with the 28 lower-order bits of the logical 
address, LA4-LA31. 

The WIM bits for I/O controller interface accesses are forced to b'010'. Some instructions 
cause multiple address/data transactions to occur on the bus. The address for each 
transaction is handled individually with respect to the MMU. 

6.10.3 I/O Controller Interface Segment Protection 

Page-level protection as described in Section 6.8.5, "Page Memory Protection," is not 
provided by the MPC601 for I/O controller interface segments. The appropriate key bit (Ks 
or Ku) from the segment register is sent to the memory controller, and the memory 
controller implements any protection required. Frequently, no such mechanism is provided; 
the fact that an I/O controller interface segment is mapped into the address space of a process 
may be regarded as sufficient authority to access the segment. 

6.10.4 Memory-Forced I/O Controller Interface Accesses 

The MPC601 performs memory-forced I/O controller interface accesses when the T bit in 
the selected segment register is set and the BUID field in the segment register is x'07F'. In 
this case, the processor bypasses all protection mechanisms and generates a memory access 
with the physical address specified by the lowest-order four bits in the segment register 
(SR[28-31]) concatenated with LA4-LA31. In this case, the processor assumes the WIM 
bits to be b'011', denoting the access as cache-inhibited and global. 
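
The memory-forced case reduces to a field comparison and a concatenation, which the following C sketch illustrates. The function names `is_memory_forced` and `memory_forced_pa` are hypothetical; the sketch assumes the segment register has already been selected by the upper four bits of the logical address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUID_MEMORY_FORCED 0x07Fu

/* True when T=1 and the BUID field (bits 3-11) is x'07F'. */
bool is_memory_forced(uint32_t sr)
{
    return ((sr >> 31) & 1u) == 1u &&
           ((sr >> 20) & 0x1FFu) == BUID_MEMORY_FORCED;
}

/* PA0-PA31 <- SR[28-31] || LA4-LA31 */
uint32_t memory_forced_pa(uint32_t sr, uint32_t la)
{
    assert(is_memory_forced(sr));
    return ((sr & 0xFu) << 28) | (la & 0x0FFFFFFFu);
}
```

For example, a segment register value of x'87F00003' and a logical address of x'12345678' would give the physical address x'32345678' under these assumptions.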

6.10.5 Instructions Not Supported in I/O Controller Interface 
Segments 

The following instructions are not supported when issued with a logical address that selects 
a segment register that has T=1: 

• lwarx 

• stwcx. 

• lscbx 

If one of the above instructions is executed with a logical address corresponding to a 
segment with T=1, a data access exception occurs and DSISR[5] is set. 

The following instructions are not supported at all and cause boundedly undefined results 
when issued with a logical address that selects a segment register that has T=1 (or when 
MSR[DT]=0): 

• eciwx 

• ecowx 

6.10.6 Instructions with No Effect in I/O Controller Interface 
Segments 

The following instructions are executed as no-ops when issued with a logical address that 
selects a segment where T=1: 

• dcbt 

• dcbtst 

• dcbf 

• dcbi 

• dcbst 

• dcbz 



6.10.7 I/O Controller Interface Summary Flow 

Figure 6-26 shows the flow used by the MMU when I/O controller interface address 
translation is selected. This figure expands the I/O Controller Interface Translation stub 
found in Figure 6-4 for both instruction and data accesses. 

[Flow diagram not reproduced. Recoverable labels: I/O Controller Interface Translation 
(T=1); SR[BUID] = x'07F' (Memory-Forced) or otherwise; Instruction Access; Data Access; 
Instruction Access Exception; lwarx, stwcx., or lscbx Instruction; Floating-Point Load or 
Store; Alignment Exception; Cache Instruction (dcbt, dcbtst, dcbf, dcbi, dcbst, or dcbz); 
No-Op; PA0-PA31 <- SR[28-31] || LA4-LA31; Memory Access Performed; Perform I/O 
Controller Interface Access.] 

Figure 6-26. I/O Controller Interface Translation Flow 






Chapter 7 
Instruction Timing 



This chapter describes instruction prefetch and execution through all of the execution units 
of the MPC601 processor. It also provides examples of instruction sequences showing 
concurrent execution and various register dependencies to illustrate timing interactions. 
Bus signals described in this chapter are only accurate to within half-clock cycle 
increments. Refer to Chapter 9, "System Interface Operation," for more specific 
information regarding bus operation timing. Instruction mnemonics used in this chapter can 
be identified by referring to Chapter 10, "Instruction Set." 

7.1 Instruction Timing Overview 

The MPC601 processor has been designed to minimize average instruction execution 
latency. Latency is defined as the number of clock cycles necessary to execute an 
instruction and make ready the results of that execution for a subsequent instruction. For 
the majority of instructions in the MPC601, this can be simplified to include only the 
execute phase for a particular instruction. However, data access instructions require 
additional clock cycles between the execute phase and the writeback phase due to memory 
latencies. 

In accordance with this definition, logical, bit-field, and most integer instructions have a 
latency of one clock cycle (that is, results for these instructions are ready for use on 
the next clock cycle after issue). Other instructions, such as the integer multiply, require 
more than one clock cycle to complete execution. 

Effective throughput of more than one instruction per clock cycle can be realized by the 
many performance features in the MPC601 including pipelining, superscalar instruction 
issue, branch acceleration, and multiple execution units that operate independently and in 
parallel. 

Many of the execution units on the MPC601 are said to be pipelined. This implies that the 
particular execution unit is broken into stages. Each stage performs a specific step, which 
contributes to the overall execution of an instruction. The pipelined design is analogous to 
an assembly line where workers perform a specific task and pass the partially complete 
product to the next worker. 

When an instruction is issued to a pipelined execution unit, the first stage in the pipeline 
begins its designated work on that instruction. As an instruction is passed from one stage in 
the pipeline to the next, evacuated stages may accept new instructions. This design allows 
a single execution unit to be working on several different instructions simultaneously. Once 
the pipeline has been filled with instructions, the execution unit completes a multi-cycle 
instruction every clock. 

Figure 7-1 shows a graphical representation of a generic pipelined execution unit. 

[Diagram not reproduced: a three-stage pipeline shown over clocks 0 through 3. Instruction 
A enters stage 1 on clock 0, B on clock 1, and C on clock 2, and each instruction advances 
one stage per clock.] 

Figure 7-1. Pipelined Execution Unit 

If the number of stages in each pipeline is equal to the total latency in clock cycles of its 
respective execution unit, the processor can continuously issue instructions to the same 
execution unit without stalling. Thus, when enough instructions have been issued to an 
execution unit to fill its pipeline, the first instruction will have completed execution and 
exited the pipeline, allowing subsequent instructions to be issued into the tail of the pipeline 
without interruption. 
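
As a rough illustration of the paragraph above, the number of clocks an idealized pipeline needs to finish a run of instructions can be computed as follows. This is a minimal sketch, assuming one clock per stage, one instruction issued per clock, and no stalls; the function name `pipeline_cycles` is hypothetical and the formula describes the generic model of Figure 7-1, not measured MPC601 timing.

```c
#include <assert.h>

/* Clocks needed for n instructions to pass through a pipeline with the given
 * number of stages: the first instruction needs `stages` clocks, and each
 * later instruction completes one clock after its predecessor. */
unsigned pipeline_cycles(unsigned stages, unsigned n)
{
    assert(stages > 0);
    return (n == 0) ? 0 : stages + n - 1;
}
```

With three stages and three instructions this gives five clocks, compared with nine if each instruction had to leave the unit before the next could enter.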

The MPC601 is capable of retiring three instructions on every clock cycle. In general, 
instruction processing is accomplished in four stages — the prefetch stage, the decode stage, 
the execute stage, and the writeback stage. The instruction prefetch stage includes the clock 
cycles necessary to request instructions from the on-chip cache as well as the time it takes 
the on-chip cache to respond to that request. The decode stage consists of the time it takes 
to fully decode the instruction. Each of the three execution units on the MPC601 implements 
this general pipeline model slightly differently. These details are explained in the following 
paragraphs. 

In the writeback stage, results are returned to the register file. This stage generally does not 
contribute to the overall execution time. Instructions are prefetched and executed 
concurrently with the execution and writeback of previous instructions, producing an 
overlap period between instructions. 

7.2 Timing Considerations of the MPC601 

A superscalar machine is one that can issue multiple instructions concurrently from a 
conventional linear instruction stream. The MPC601 is a true superscalar implementation 
of the PowerPC architecture since three instructions can be issued to multiple execution 
units during each clock cycle. Although a superscalar implementation complicates 
instruction timing, these complications are largely transparent to the software. The 
MPC601 provides the logical functionality of issuing only a single instruction at a time, 
while providing the increased performance of issuing multiple instructions at a time. 

The execution unit pipelines are hardware interlocked; therefore, data dependencies 
automatically stall instruction issue without software assistance. This hardware 
interlocking mechanism eliminates the need to schedule wasteful no-op instructions. 

When an instruction is issued, the register file places the appropriate source data on the 
appropriate source bus. The corresponding execution unit then reads the data from the bus. 
The register files and source buses have sufficient bandwidth to sustain the peak issue rate 
of three instructions per clock. 

The MPC601 contains the following execution units that operate independently and in 
parallel: 

• Branch processing unit (BPU) 

• 32-bit integer unit (IU) 

• 64-bit floating-point unit (FPU) 

When the IU finishes executing an instruction, it places the resulting data, if any, onto one 
of the writeback buses. The results are then stored into the correct general-purpose register. 
If a subsequent instruction is waiting for this data, it is forwarded past the register file, 
directly into the appropriate execution unit for the immediate execution of the waiting 
instruction. This allows a data-dependent instruction to be decoded without waiting for the 
data to be written into the register file and then read back out again. This feature, known as 
feed forwarding, significantly shortens the time the machine may stall on data 
dependencies. 

When the FPU finishes executing an instruction, it places the resulting data, if any, onto one 
of the writeback buses. The results are then stored into the correct floating-point register. If 
a subsequent instruction is waiting for this data, that instruction must wait for the data to be 
written into the floating-point register file. On the next clock cycle, the following 
instruction may begin decode. In other words, the floating-point execution unit is not 
equipped with a feed-forwarding mechanism. The exception to this point is when a floating- 
point instruction is waiting for data from a floating-point load operation. In this case, the 
floating-point operation may begin decode during the same clock cycle as the floating-point 
register file is being updated. 

When the BPU finishes executing an instruction, it places the resulting data, if any, onto 
one of the writeback buses. The results are then stored into the correct special-purpose 
register. If a subsequent instruction is waiting for this data, it is forwarded past the register 
file, directly into the appropriate execution unit for the immediate execution of the waiting 
instruction. This allows a data-dependent instruction to be decoded without waiting for the 
data to be written into the register file and then read back out again. The feed forwarding 
feature significantly shortens the time the machine may stall on data dependencies. 

7.2.1 Instruction Queue (IQ) 

The instruction queue (IQ) contains instructions prefetched from the current instruction 
stream. Instructions enter the IQ and are issued to the various execution units from the IQ. 
The IQ is an eight-entry queue, which is the backbone of the master pipeline for the 
microprocessor. The MPC601 tries to keep the IQ full at all times. As previously 
mentioned, a maximum of three instructions can be dispatched during a single clock cycle. 
If, while topping off the IQ, the request for new instructions misses in the on-chip cache, 
then a memory access will only occur if the IQ is half empty. In other words, if the IQ is 
only trying to fill its top half (one to four instructions are needed), and those instructions 
are not found in the on-chip cache, a memory access will not occur. However, if the IQ is 
trying to fill the top five entries (five instructions are needed), and those instructions are 
not found in the on-chip cache, arbitration for a memory access will begin. The figures in 
this chapter show the changes in the IQ due to the execution of the flow-control 
instructions; they do not show the dynamic state of the IQ or the "topping off" effect. 
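
The refill rule in the paragraph above can be expressed as a simple predicate. The sketch below is a hedged reading of the text: it assumes that a cache miss starts arbitration for a memory access whenever five or more IQ entries need to be filled, which is stated explicitly only for the five-entry case. The names `IQ_ENTRIES` and `iq_miss_starts_memory_access` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define IQ_ENTRIES 8

/* Given the number of IQ entries that still need instructions, decide whether
 * a prefetch that misses in the on-chip cache goes out to memory.
 * Assumption: the threshold of five also applies to six, seven, or eight. */
bool iq_miss_starts_memory_access(unsigned entries_needed)
{
    assert(entries_needed <= IQ_ENTRIES);
    return entries_needed >= 5;
}
```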

Instructions enter the IQ through entry 7 and filter down to entry 0. The prefetch bus 
between the IQ and the on-chip cache is wide enough for eight instructions to be brought 
into the IQ simultaneously; that is, the IQ can go from being completely empty to 
completely full in one clock cycle. Note that although a maximum of eight instructions can 
be brought in from the on-chip cache in a single clock cycle, some restrictions do occur. 
Specifically, only the instructions between the instruction requested and the last word in the 
mod-32 aligned block are fed into the IQ. For example, if the BPU requests a block of 
instructions starting at address x'10', then instructions contained in the block from x'10' to 
x'20' will be sent to the IQ. 
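
The restriction above amounts to a small piece of address arithmetic, sketched here in C. The function name `iq_fetch_count` is hypothetical, and the sketch assumes 4-byte instructions, a word-aligned request address, and that the end of the block (x'20' in the example) is exclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Number of instructions delivered to the IQ for a request at addr:
 * everything from addr up to the end of its 32-byte aligned block. */
unsigned iq_fetch_count(uint32_t addr)
{
    assert((addr & 3u) == 0); /* instructions are word aligned */
    return (32u - (addr & 31u)) / 4u;
}
```

Under these assumptions a request at x'10' yields four instructions (x'10', x'14', x'18', and x'1C'), and only a request at the start of a block yields the full eight.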

Each of the execution units pulls instructions out of the IQ from specified entries. For 
example, integer instructions are only dispatched from the IQ through entry 0. In fact, IQ-0 
is also the decode stage for the integer unit. Floating-point instructions may be dispatched 
to the FPU from entries 0-3. The branch processing unit also may pull instructions from 
the IQ from entries 0-3. 
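
The dispatch positions described above can be captured in a short lookup, shown here as a C sketch. The `exec_unit` enumeration and the `can_issue_from` function are hypothetical names; the sketch encodes only the entry ranges given in the text and ignores every other issue condition (dependencies, busy units, and so on).

```c
#include <assert.h>
#include <stdbool.h>

#define IQ_ENTRIES 8

typedef enum { UNIT_IU, UNIT_FPU, UNIT_BPU } exec_unit;

/* True if an instruction for the given unit may be pulled from this IQ entry:
 * the IU only from entry 0, the FPU and BPU from entries 0 through 3. */
bool can_issue_from(exec_unit unit, unsigned iq_entry)
{
    assert(iq_entry < IQ_ENTRIES);
    if (unit == UNIT_IU)
        return iq_entry == 0;
    return iq_entry <= 3;
}
```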

7.2.2 General Instruction Flow 

Instructions are said to "issue" from the IQ to the appropriate execution unit. Although 
there are only three execution units that pull instructions out of the IQ, each execution unit 
may have several paths from which to pull instructions out of the IQ. Figure 7-2 shows how 
instructions can be pulled from the IQ and how those instructions progress through the 
various execution units. Note that Figure 7-2 is not a data flow diagram; rather, it simply 
shows the various stages in the processor. 

Figure 7-2. Instruction Stages 

Once an instruction has been pulled from the IQ by the branch processing unit, the 
remaining instructions found above the one just pulled by the BPU will be shifted down by 
one element. Once the BPU has pulled an instruction from the IQ, that instruction is placed 
into the decode/execute stage of the BPU. The branch is either executed and resolved, or is 
predicted. Once a branch instruction has been executed, it may need to update a special 
purpose register. In that case, the branch instruction will do its writeback sometime after 
the decode/execute phase. If no writeback is needed, the branch instruction is retired. 

An integer instruction cannot issue to the IU until that instruction has filtered to the bottom 
element of the IQ. Once the integer instruction has been properly decoded, it moves 
into the next phase of the IU pipeline. If the execute phase of the IU is not available, then 
the integer instruction may be issued into a one-entry buffer that lies between IQ-0 (the 
decode stage for the integer unit) and the execute phase of the integer unit. If the execute 
phase of the IU is available, the instruction will fall through the integer unit buffer into the 
execute phase with no delay. It is important to note that if a data dependency is present, then 
the instruction will be held in the execute stage of the IU pipeline. Once the data 
dependency is resolved, the pending instruction will continue execution and will proceed 
through the IU pipeline. This is important because if an instruction stalls in the IU pipeline, 
it will stall in the execute stage, leaving the decode stage and the buffer open for subsequent 
integer instructions. Once execution is complete, the integer instruction is 
moved into the writeback stage, where results are written into the register file. 

Floating-point instructions can issue to the FPU from any one of the bottom four elements 
of the IQ. Once a floating-point instruction has been pulled from the IQ, it moves into the 
floating-point buffer. The floating-point execution unit contains a one-entry buffer that lies 
between the IQ and the floating-point decode stage. All floating-point instructions must 
spend at least one clock cycle in the floating-point buffer. If the decode stage remains 
occupied after an instruction has already spent one clock cycle in the floating-point buffer, 
then the floating-point instruction will remain in the buffer. If both the decode stage and the 
buffer are occupied, then no floating-point instruction may be pulled from the IQ. The 
one-entry buffer in the FPU pipeline helps minimize any penalties when certain instructions 
do not move from one stage in the pipeline to another in a single clock cycle. 

Certain floating-point instructions may spend multiple clock cycles in the decode/execute 
phases of the floating-point unit. When a double-precision multiply instruction is 
encountered, it will spend a minimum of two clock cycles in each phase of the floating- 
point execution unit pipeline — two clock cycles in the decode phase, a minimum of two 
clock cycles in FPU execute 1, and a minimum of two clock cycles in FPU execute 2. When 
one of these instructions enters the floating-point execution unit pipeline, previous floating- 
point pipeline stages (except for the FPU buffer) become unavailable. For example, when 
a double-precision multiply (or single- or double-precision divide) instruction moves from 
the FPU decode stage into the FPU execute stages, no other FPU instruction may enter the 
FPU decode stage until that double-precision instruction has moved out of the FPU execute 
stages into the writeback stage. 

7.2.3 Instruction Prefetch Timing 

The timing of the prefetch mechanism on the MPC601 depends heavily on the state of the 
on-chip cache. There are two factors that determine how quickly the cache responds to the 
BPU's request for additional instructions: 

• Is the cache currently serving a previous request? 

• Is the instruction being asked for in the on-chip cache (cache hit) or will a memory 
transaction need to be initiated to bring the data into the cache (cache miss)? 

These issues are discussed further in the following sections. 

7.2.3.1 Cache Arbitration 

When the branch processing unit attempts to prefetch instructions from the on-chip cache, 
the cache may or may not be able to immediately respond to the request. There are four 
scenarios that may be encountered by the BPU when it requests instructions from the on- 
chip cache. 

The first scenario is when the on-chip cache is idle and a request comes in from the BPU 
for additional instructions. In this case, the on-chip cache responds with the requested 
instructions on the next clock cycle. 

The second scenario occurs if, at the time the BPU requests instructions, the on-chip cache 
is busy. A busy state may occur due to accesses in progress to/from memory or when 
snooping cache change activity is in progress. When this case arises, the on-chip cache may 
be inaccessible for one or two clock cycles, depending on the exact state of the memory 
access that is in progress. 

The third scenario occurs if the integer unit has any pending data access operations. In this 
case, priority is always given to the pending IU data accesses. As a result, the BPU may see 
a delay in the response to its request for instructions. In addition, if one of these pending 
data access operations will cause a cache miss, one of the previously described scenarios 
may also occur. 

Note that if the on-chip cache is servicing a previous access that results in a cache hit, no 
delay is seen by the BPU. 

7.2.3.2 Cache Hit 

Assuming that the branch processing unit gains control of the on-chip cache and the 
instructions it needs are in the on-chip cache (a cache hit has occurred), there will only be 
one clock cycle between the time that the BPU requests the instructions and the time that 
the instructions enter the IQ. As previously stated, any number of instructions between one 
and eight can be simultaneously fetched from the on-chip cache and loaded into the IQ. 

Figure 7-3 shows a brief example of an instruction prefetch that hits in the on-chip cache 
and how that prefetch affects instruction issue. In this example, eight instructions are fed 
into the IQ during clock cycle 0. 

During clock cycle 1, instruction 0 is decoded and instruction 1 is fed into the floating-point 
buffer. Notice that the branch instruction (instruction 4) is still not within the bottom four 
elements of the IQ; thus it may not begin its decode/execute phase. 

During clock cycle 2, another integer instruction and floating-point instruction are pulled 
from the IQ. In addition, the branch instruction is now within the bottom four elements of 
the IQ; thus it may be pulled out of the IQ into the branch pipeline. Notice that the branch 
pipeline has a combined decode/execute stage. The BPU is immediately able to determine 
that the branch will indeed change program flow and sends a request to the on-chip cache 
for the new instruction stream. 

During clock cycle 3, the new instructions arrive in the IQ. Note that instructions 5, 6, and 
7 are never decoded and are discarded (because of the taken branch) when the new set of 
instructions is brought into the IQ. 

During clock cycles 4 through 8, the appropriate instructions move through the various 
pipelines toward completion. As the IQ is emptied into the individual execution unit 
pipelines, additional instructions will be requested from the on-chip cache. 

Figure 7-3. Instruction Timing — Cache Hit 

7.2.3.3 Cache Miss 

Assuming that the BPU gains control of the on-chip cache and the instructions that it needs 
are not in the on-chip cache (a cache miss has occurred), there will only be seven clock 
cycles between the time that the BPU requests the instructions and the time that the 
instructions are available for decode. These seven clock cycles do not take into account any 
wait states that may be present in the memory system. 

Figure 7-4 shows a brief example of an instruction prefetch that misses in the on-chip cache 
and how that prefetch affects the instruction issue. In this example, eight instructions are 
fed into the IQ during clock cycle 0. 

During clock cycle 1, instruction 0 is decoded and instruction 1 is fed into the floating-point 
buffer. Notice that the branch instruction (instruction 4) is still not within the bottom four 
elements of the IQ; thus it may not begin its decode/execute phase. 







Figure 7-4. Instruction Timing — Cache Miss 

During clock cycle 2, another integer instruction and floating-point instruction are pulled 
from the IQ. In addition, the branch instruction is now within the bottom four elements of 
the IQ; thus it may be pulled out of the IQ into the branch pipeline. Notice that the branch 
pipeline has a combined decode/execute stage. The BPU is immediately able to determine 
that the branch will indeed change program flow and sends a request to the on-chip cache 
for the new instruction stream. 

During clock cycle 3, the on-chip cache misses the access and determines that a memory 
access will have to occur. During clock cycle 4 the address of the block of instructions is 
applied to the system bus. During clock cycle 5 two instructions (64 bits) are returned from 
memory. The instructions are not fed directly into the on-chip cache as they are received 
from the memory system, but they are buffered into groups of 128 bits. 

During clock cycle 7, the first 128 bits of instructions have been received and a request is 
placed to the on-chip cache for access, in order to actually update the cache with the new 
instructions. Also during clock cycle 7, the third pair of instructions is being received from 
memory. During clock cycle 8, the request for access to the on-chip cache is acknowledged 
and the first four instructions are fed into the on-chip cache and into the IQ, as required. 
Also during clock cycle 8, the fourth pair of instructions is received from memory. 

During clock cycle 9, another request for access is made to the on-chip cache to update the 
cache with the second four instructions received from memory. Also during clock cycle 9, 
instructions 8 and 9 are pulled from the IQ into the appropriate execution units. 

During clock cycle 10, the request for access to the on-chip cache is acknowledged and the 
second four instructions are fed into the on-chip cache and into the IQ, as required. Also 
during clock cycle 10, two more instructions are pulled from the IQ. 

During clock cycle 11, instructions 12-15 are fed into the IQ. During the following clock 
cycles, these instructions move through the appropriate pipelines toward completion. 
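
The cache-miss sequence described above can be summarized as a small timeline. The following Python sketch is illustrative only. The function names and the wait_states parameter are hypothetical, and the assumption that memory wait states simply delay every data beat by the same amount is not taken from this manual; the cycle offsets themselves follow the Figure 7-4 walkthrough.

```python
def cache_miss_timeline(request_cycle=2, wait_states=0):
    """Return {cycle: [events]} for an instruction fetch that misses the cache.

    Offsets follow the Figure 7-4 walkthrough (BPU request in clock cycle 2).
    Assumption: wait states delay every 64-bit beat from memory equally.
    """
    miss = request_cycle + 1                 # cache determines that it missed
    address = miss + 1                       # address applied to the system bus
    first_beat = address + 1 + wait_states   # first pair of instructions returns
    events = {miss: ["cache miss detected"], address: ["address on system bus"]}
    for pair in range(4):                    # four 64-bit beats = eight instructions
        events.setdefault(first_beat + pair, []).append(f"pair {pair + 1} received")
    for group in range(2):                   # beats are buffered into 128-bit groups
        request = first_beat + 2 + 2 * group
        events.setdefault(request, []).append(
            f"cache access requested for group {group + 1}")
        events.setdefault(request + 1, []).append(
            f"group {group + 1} written to cache and IQ")
    return events


def first_decode_cycle(request_cycle=2, wait_states=0):
    """Cycle in which the first fetched instructions can leave the IQ."""
    timeline = cache_miss_timeline(request_cycle, wait_states)
    landed = min(cycle for cycle, names in timeline.items()
                 if "group 1 written to cache and IQ" in names)
    return landed + 1


if __name__ == "__main__":
    for cycle, names in sorted(cache_miss_timeline().items()):
        print(cycle, ", ".join(names))
```

With the default arguments, the first decode falls seven clock cycles after the request, which agrees with the figure quoted in this section for a memory system without wait states.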

7.2.4 Instruction Decode Timing 

Most instructions can be decoded in one clock cycle on the MPC601. In addition, recall that 
the BPU may decode and execute all instructions in a single clock cycle. Although an 
instruction may be decoded in one clock cycle, other factors may keep the instruction from 
moving to the next stage in the pipeline. Those factors include dependencies on source 
operands, dependencies on registers being available to act as the instruction's destination, 
and the data type of the operands. 

If some dependency exists that may preclude an instruction from beginning execution, that 
instruction may stall in a different stage of its pipeline depending on the type of instruction. 
For example, if it is a floating-point instruction, the instruction is held in the decode stage 
of the floating-point pipeline. If the data that the floating-point operation depends upon is 
returned via a cache access, the decode may begin during the same clock cycle that the 
floating-point register file is being updated. However, if the data that the floating-point 
operation depends upon is returned via the result of a previous floating-point operation, 
then the decode will begin the clock cycle after the floating-point register file is updated. 

If the instruction that has a data dependency is an integer instruction, the instruction is fully 
decoded and may be moved into the execute stage of the integer pipeline where it will wait 
for the source data to become available. The integer instruction will begin execution during 
the same clock cycle as the update to the general-purpose register file. 
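
The dependency rules in the preceding paragraphs reduce to a simple relationship between the clock cycle in which a register file is updated and the cycle in which the waiting instruction can proceed. The Python sketch below restates those rules; the function names and the "load" / "fp_op" labels are hypothetical and are used only for illustration.

```python
def fp_decode_cycle(regfile_update_cycle, producer):
    """Cycle in which a stalled floating-point instruction can begin decode.

    producer is "load" when the operand is returned via a cache access and
    "fp_op" when it is the result of a previous floating-point operation.
    """
    if producer == "load":
        # Decode may begin in the same cycle as the register file update.
        return regfile_update_cycle
    if producer == "fp_op":
        # Decode begins the clock cycle after the register file is updated.
        return regfile_update_cycle + 1
    raise ValueError("producer must be 'load' or 'fp_op'")


def int_execute_cycle(gpr_update_cycle):
    """Cycle in which a waiting integer instruction begins execution.

    The integer instruction waits in the execute stage and starts in the same
    cycle in which the general-purpose register file is updated.
    """
    return gpr_update_cycle
```

For example, if a floating-point result is written to the register file in clock cycle 6, a dependent floating-point instruction begins decode in cycle 7, whereas a dependent integer instruction waiting on a general-purpose register updated in cycle 6 begins execution in cycle 6.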

If a flow-control operation has a data dependency on the condition register, the instruction 
will be predicted during the decode/execute phase in the BPU. 

If the instruction is a floating-point multiply operation with double-precision operands, 
then that instruction will spend a minimum of two clock cycles in the decode stage of the 
floating-point pipeline. 

7.2.4.1 Source Register Considerations 

If an instruction attempts to use a source operand that is still being computed by a previous 
instruction, a data dependency exists. When a data dependency exists, instruction 
acceptance into that execution unit is halted until all of the necessary source data is 
available. The MPC601 uses a hardware mechanism to keep track of which registers are 
available for use. 

Data access instructions follow a unique set of rules for out-of-order issue and completion. 
Only one memory access instruction can be issued per clock cycle; however, store 
instructions may be issued before the data being stored is available. This allows continued 
issuance and execution of other instructions in parallel with the computation of the source 
data for the store operation. For example, assume a floating-point divide that will update 
floating-point register 5 (fr5) begins execution during clock cycle 0. The result of that divide 
will not be available for other instructions to use for a number of clock cycles. Now assume 
that a store of fr5 to address 0x1000 immediately follows the divide operation in the 
instruction stream. Rather than stalling instruction processing while the store operation 
waits for fr5 to become available, the address of the store is calculated and the store is 
moved into a write buffer. By removing the waiting store from the integer execution 
pipeline, the IU can process additional instructions while the store waits in the write buffer 
for fr5 to become available. 

Note that address operands for store operations are still subject to source register checks. For 
example, assume a load that will update general-purpose register 5 (r5) begins execution 
during clock cycle 0. The result of that load will not be available for other instructions to use for 
a number of clock cycles. Now assume that a store of r7 to the address contained in r5 
immediately follows the load operation in the instruction stream. Rather than issuing the 
store into a write buffer, the store operation will stall in the IU's execute stage. In other 
words, for a store to be moved into the write buffer, the address calculation must be 
complete. The address calculation is not able to complete unless all address source 
operands are available. 

Additionally, load instructions may bypass store instructions that are pending as long as the 
address being accessed by the load instruction does not match that being accessed by any 
pending store instruction. 
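
The store and load rules above can be sketched as a small model of the write buffer. This Python example is a simplified illustration, not a description of the hardware: the class and method names are hypothetical, the three-entry capacity is the figure quoted for the write buffer in Section 7.3.2.2, and comparing full addresses for equality is an assumption about how a match is detected.

```python
from collections import deque


class WriteBuffer:
    """Toy model of pending stores waiting for their source data."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.pending = deque()          # entries are (address, source_register)

    def issue_store(self, address, source_register, address_ready=True):
        """Try to move a store out of the IU execute stage into the buffer.

        Returns False (the store stalls in execute) if the address operands
        are not yet available or the buffer is full.
        """
        if not address_ready or len(self.pending) >= self.capacity:
            return False
        self.pending.append((address, source_register))
        return True

    def load_may_bypass(self, address):
        """A load may pass pending stores only if no store targets its address."""
        return all(addr != address for addr, _ in self.pending)

    def retire_ready(self, ready_registers):
        """Complete, in order, the oldest stores whose data is now available."""
        retired = []
        while self.pending and self.pending[0][1] in ready_registers:
            retired.append(self.pending.popleft())
        return retired
```

In the fr5 example, the store to 0x1000 enters the buffer while the divide is still executing; a later load from a different address may bypass it, but a load from 0x1000 must wait until the store has been retired.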

7.2.4.2 Destination Register Considerations 

The following paragraphs describe how the MPC601 prevents destination registers from 
being overwritten by out-of-sequence instructions and how instructions are prioritized for 
writing back to the register files. 

In a machine that allows instructions to issue, execute, and complete out of order, there is 
the potential for an instruction's result to be overwritten by an instruction that completes later 
but appears earlier in the instruction stream. Consider the following example: given two 
instructions, instruction_a and instruction_b, where instruction_a occurs before 
instruction_b in the instruction stream. Now, assume that instruction_a is decoded and 
begins execution during clock cycle 0. Instruction_a completes execution and is ready to 
update general-purpose register 30 on clock cycle 6. Assume that instruction_b is decoded 
and begins execution during clock cycle 2. Instruction_b completes execution in a single 
clock cycle and is ready to update general-purpose register 30 during clock cycle 3. If both 
updates were allowed to proceed as soon as they were ready, instruction_a would overwrite 
the result of instruction_b, leaving general-purpose register 30 updated incorrectly. 

To preclude this possibility, a register interlock mechanism is employed that guarantees that 
all source and destination registers are read from and written to in proper order. This 
ensures that updates to any given register are always completed in the order specified by 
the program and thus no data is ever incorrectly overwritten in the register files. If an 
instruction is in execute, it does not move into the writeback stage until its destination 
register is guaranteed not to be the destination register for a previously issued, but 
incomplete, instruction. 
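
The effect of the interlock on the instruction_a and instruction_b example can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the function name is hypothetical, and the choice to let the later write occur one clock cycle after the earlier one is an assumption, since the text states only that the later instruction must wait until the earlier one is complete.

```python
def writeback_cycles(instructions):
    """Compute writeback cycles with same-destination writes kept in order.

    instructions is a list of (name, dest_register, ready_cycle) tuples in
    program order, where ready_cycle is the first cycle in which the result
    could be written back if no interlock applied.
    """
    last_write = {}     # dest_register -> cycle of the most recent earlier write
    result = {}
    for name, dest, ready in instructions:
        cycle = ready
        if dest in last_write:
            # Hold the instruction in execute until the earlier write is done.
            cycle = max(ready, last_write[dest] + 1)
        last_write[dest] = cycle
        result[name] = cycle
    return result


if __name__ == "__main__":
    program = [("instruction_a", "r30", 6), ("instruction_b", "r30", 3)]
    print(writeback_cycles(program))
```

Under these assumptions instruction_b, although ready in clock cycle 3, does not write register 30 until after instruction_a has written it in clock cycle 6, so the final value is the one the program order requires.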

7.2.5 Instruction Execute Timing 

Assuming that an instruction has completed its decode stage, and that the required execute 
stage is available, it should be forwarded into an execute stage. There are additional factors 
that must be considered in calculating when and how an instruction moves from its decode 
stage into its execute stage. First, it is possible that the specified execution unit may not be 
available for any additional instructions. In addition, if an instruction happens to stall in the 
IQ, it is possible that the following instructions may bypass the stalled instruction and begin 
execution. This is known as out-of-order instruction issue. 

Both topics are discussed further in Sections 7.2.5.1 , "Execution Unit Considerations," and 
7.2.5.2, "Out-of-Order Instruction Issue." 

7.2.5.1 Execution Unit Considerations 

As previously noted, the MPC601 is capable of issuing three instructions per clock cycle. 
One of the hindrances in maintaining this peak is the availability of execution units on each 
clock cycle. 

For an instruction to be issued, the required execution unit must be available, or have an 
available spot in its buffer. The sequencer monitors the availability of all execution units 
and suspends instruction issue if the required execution unit is not available. An execution 
unit may not be available under the following circumstances: 

• An execution unit may become unavailable for additional instructions if its pipeline 
becomes full. This situation may occur if execution takes more clock cycles than the 
number of pipeline stages in the unit and additional instructions are issued to that 
unit to fill the remaining pipeline stages. 

• Execution units can accept only one instruction per clock. 

It is important to note that the integer unit and the floating-point unit each contain a 
one-entry buffer that helps reduce the effects of long-latency instructions. Even if a specific 
stage of one of these pipelines becomes busy for multiple clock cycles, these buffers may 
continue to accept instructions from the IQ. 

7.2.5.2 Out-of-Order Instruction Issue 

As previously mentioned, integer instructions may only be issued from the IQ through 
element 0. Also mentioned was the fact that floating-point instructions may be issued from 
any of the IQ elements 0-3. Since the different execution units are able to pull instructions 
from the IQ through different elements, it is possible for one execution unit to pull an 
instruction from the IQ and begin execution while a previous instruction remains held in 
the IQ. Figure 7-5 shows an example of out-of-order instruction issue on the MPC601. 

Figure 7-5. Instruction Timing — Out-of-Order Execution 

On clock 0 in Figure 7-5, eight instructions are received from the on-chip cache. During 
clock cycle 1, instructions 0 and 2 are pulled from the IQ while instruction 1 remains. Note 
that the Load Word and Zero (lwz) instruction issues to the integer unit while the 
Floating-Point Add (fadd) instruction issues to the floating-point unit; thus these two instructions 
can begin processing during the same clock cycle. 

During clock cycle 2, two more instructions are pulled from the IQ. During the execution 
of instruction 0 on clock 2, a request is sent to the on-chip cache for the required data. 

On clock 3, instructions 1 and 3 have been decoded and are ready to begin execution. 
Unfortunately, instruction 1 has a data dependency on instruction 0, which has not yet 
completed. For this reason, instruction 1 cannot begin execution on this clock cycle. 
Instruction 1 will wait in the IU until its data dependency is resolved. Notice, however, that 
instruction 2 begins execution during clock cycle 3 even though instruction 2 occurs after 
instruction 1 in the instruction stream. 

On clock cycle 4, instruction 0 completes and data is being written back into the 
general-purpose register file while simultaneously being forwarded to the waiting instruction 1. As 
its source data is fed to it, instruction 1 is able to immediately begin execution. 

Notice that instruction 1 appears to be in the execute stage of the integer unit for two clock 
cycles. However, this is only because instruction 1 will move into the execute stage during 
clock cycle 3 while it waits for its source data. No execution of instruction 1 occurs during 
clock cycle 3. 

Floating-point instructions are able to issue out of order with respect to integer instructions 
and flow-control instructions. Integer instructions cannot issue out of order with respect to 
any other instructions. Flow-control instructions (issued to the BPU) are allowed to issue 
out of order with respect to both integer instructions and floating-point instructions, but not 
with respect to other flow-control instructions. 
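
The issue rules of this section can be sketched as a selection over the bottom of the IQ. The Python example below is a simplified illustration: the function name and the unit labels are hypothetical, dependencies and unit availability are ignored, and taking only the oldest floating-point candidate in each clock is an assumption based on the statement that an execution unit accepts one instruction per clock.

```python
def select_for_issue(iq):
    """Return the IQ indices that may issue in the current clock.

    iq lists the target unit of each queued instruction ("IU", "FPU", or
    "BPU"), with element 0 first.
    """
    issued = []
    # Integer instructions issue only through IQ element 0.
    if iq and iq[0] == "IU":
        issued.append(0)
    # Floating-point and flow-control instructions may issue from elements 0-3.
    # Taking the oldest candidate keeps each unit in order with itself.
    for unit in ("FPU", "BPU"):
        for index, kind in enumerate(iq[:4]):
            if kind == unit:
                issued.append(index)
                break
    return sorted(issued)
```

For a queue such as ["IU", "IU", "FPU", "IU"], the sketch selects elements 0 and 2, which corresponds to clock cycle 1 of Figure 7-5, where the lwz and fadd instructions leave the IQ while instruction 1 remains.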

7.2.6 Writeback Timing 

There are two writeback buses available for each register file on the MPC601. It is possible 
for more than one instruction to write to the same register file in a given clock cycle. For 
example, a load into the general register file may write its results during the same clock as 
a single-cycle integer instruction. Both of these instructions will require a separate 
writeback bus into the general register file. 

Each of the execution units independently handles data that is being written back. In the 
integer unit, if a subsequent integer instruction is waiting for data, the data is forwarded past the 
register file directly into the appropriate execution unit for the immediate execution of the 
waiting instruction. 

In the floating-point unit, if a subsequent floating-point instruction is waiting for data, that 
instruction must wait for the data to be written into the floating-point register file. On the 
next clock cycle, the following instruction may begin decode. In other words, the FPU is 
not equipped with a feed-forwarding mechanism. The exception to this point is when a 
floating-point instruction is waiting for data from a floating-point load operation. In this 
case, the floating-point operation may begin decode during the same clock cycle as the 
floating-point register file is being updated. 

7.3 Execution Unit Timings 

The following sections describe instruction timing within each of the respective execution 
units in the MPC601. All timings described are only accurate to within a half-clock cycle. 

7.3.1 Branch Processing Unit Execution Timing 

Flow-control operations (conditional branches, unconditional branches, and traps) are 
typically expensive to execute in most machines because they disrupt normal flow in the 
instruction stream. When a change in program flow occurs, the IQ must be reloaded with 
the target instruction stream. During this time, bubbles can be introduced into the execution 
units. However, previously issued instructions will continue to execute while the new 
instruction stream makes its way into the IQ. 

Performance features such as branch folding and static branch prediction help minimize the 
penalties associated with flow-control operations on the MPC601. The timing for branch 
instruction execution is determined by many factors including the following: 

• Whether the branch is taken 

• Whether the target instruction stream is in the on-chip cache 

• Whether the branch can be predicted 

• Whether the prediction is correct 

7.3.1.1 Branch Folding 

When a branch instruction is encountered in the bottom four elements of the IQ, the 
MPC601 immediately tries to pull that instruction out of the IQ and resolve it. When the 
branch processing unit pulls the branch instruction out of the IQ, the instruction above the 
branch is shifted down to take the place of the removed branch. The technique of removing 
the branch instruction from the instruction sequence seen by the other execution units is 
known as branch folding. 

Often, branch folding reduces the penalties of flow-control instructions to zero since 
instruction execution proceeds as though the branch was never there. 

If the folded branch instruction changes program flow (the branch is said to be "taken" in 
this case), the BPU immediately requests the instructions at the new target from the on-chip 
cache. In most cases, the new instructions arrive in the IQ before any bubbles are 
introduced into the execution units. If the folded branch does not change program flow (the 
branch is said to be "not taken" in this case), the branch is already removed from the 
instruction stream and execution continues as if there were never a branch in the original 
sequence. 
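
Branch folding as described above amounts to removing the branch from the bottom four IQ elements and shifting the later instructions down. The following Python sketch illustrates this; the function name and the is_branch predicate are hypothetical helpers introduced only for the example.

```python
def fold_branch(iq, is_branch):
    """Pull the oldest branch out of the bottom four IQ elements.

    iq is a list of instructions with element 0 first. Returns a tuple
    (branch, remaining_iq); branch is None if no branch is within reach.
    """
    for index, instruction in enumerate(iq[:4]):
        if is_branch(instruction):
            # Instructions above the branch shift down to fill the gap, so the
            # other execution units never see the branch.
            return instruction, iq[:index] + iq[index + 1:]
    return None, list(iq)


if __name__ == "__main__":
    queue = ["add", "fadd", "bc", "lwz", "stw"]
    print(fold_branch(queue, lambda name: name.startswith("b")))
```

A branch that has not yet reached the bottom four elements is left in place, which matches the behavior of instruction 4 during clock cycle 1 of Figure 7-3.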

7.3.1.2 Static Branch Prediction 

Static (compiler-directed) branch prediction is a mechanism by which software (for 
example, compilers) can give a hint to the machine hardware about the direction the branch 
is likely to take. When a branch instruction encounters a data dependency, the BPU waits 
for the required condition code to become available. Rather than stalling instruction issue 
until the source operand is ready, the MPC601 predicts which path the branch instruction 
is likely to take, and instructions are fetched and executed along that path. When the branch 
operand becomes available, the branch is evaluated. If the predicted path was correct, 
program flow continues along that path uninterrupted; otherwise, the processor backs up, 
and program flow resumes along the correct path. 

There is a scenario where a flow-control instruction will not be predicted on the MPC601. 
If the target address of the branch (link register) will be modified by an instruction that 
appears before the branch instruction, the branch processing unit must wait until the target 
address is available. 

The MPC601 may only execute in one level of prediction. In other words, the 
microprocessor may not predict a branch if a prior branch instruction is still unresolved. 
Additionally, while executing down a predicted path, no data or instruction accesses are 
allowed to execute if the access must go off-chip. 

The number of instructions that can be executed conditionally after the issue of a predicted 
branch instruction is limited by the fact that no conditionally executed instruction may 
actually update the register files or memory. That is, instructions may be issued and 
executed conditionally, but may not reach the writeback stage of their pipelines. When a 
conditionally issued instruction has completed execution, it will not be moved into the 
writeback stage; instead, it will simply stall in the last execute phase of that execution unit. 
This means that the execution units may become full, which will limit the number of 
additional instructions that may be issued conditionally. 

In the case of a misprediction, the MPC601 is able to reverse its machine state rather 
painlessly because the programming model has not been updated. When a branch is found 
out to be mispredicted, all instructions that were issued conditionally are simply flushed 
from the execution unit pipelines. No register state needs to be restored because no register 
state was modified conditionally. 
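
The prediction rules above (one level of prediction, no writeback for conditionally executed instructions, and a simple flush on misprediction) can be sketched as a small state holder. This Python example is an illustration only: the class and method names are hypothetical, and the restriction on off-chip accesses along a predicted path is not modeled.

```python
class PredictedPath:
    """Toy model of conditional execution behind one predicted branch."""

    def __init__(self):
        self.active = False
        self.held = []      # executed conditionally, stalled before writeback

    def predict(self):
        """Start a predicted path; refuse while a prior branch is unresolved."""
        if self.active:
            return False
        self.active = True
        return True

    def execute_conditionally(self, instruction):
        """Execute an instruction on the predicted path without writing back."""
        if not self.active:
            raise RuntimeError("no predicted branch is outstanding")
        self.held.append(instruction)

    def resolve(self, prediction_correct):
        """Resolve the branch; return the instructions allowed to write back."""
        held, self.held = self.held, []
        self.active = False
        # On a misprediction the held instructions are flushed. No register
        # state has to be restored because none was modified.
        return held if prediction_correct else []
```

Because nothing on the predicted path reaches writeback before the branch resolves, recovery in this sketch consists of discarding the held list, which mirrors the flush described in the text.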

7.3.1.2.1 Predicted "Not Taken" Branch Timing Examples 

Figure 7-6 depicts the case where branch instructions are predicted to be not taken. During 
clock cycle 0, eight instructions arrive into the IQ. During clock cycle 1, instructions 0, 1, 
and 2 are pulled from the IQ and into their respective execution units. Notice that the BPU 
has a combined decode/execute stage; thus the branch (instruction 1) is predicted not to be 
taken during clock cycle 1. The branch is predicted, rather than resolved, because its source 
data (the condition register) is not yet available. 

During clock cycle 2, instructions 0 and 2 progress through their pipelines. In addition, the 
branch (instruction 1) remains predicted. Notice that the next branch instruction 
(instruction 5) is not able to begin its decode/execute phase while instruction 1 is predicted. 

During clock cycle 3, instruction 0 begins its writeback stage. The writeback of instruction 
0 resolves the data dependency for the first branch (instruction 1); thus the first branch 
becomes resolved and it is determined that the prediction was correct. Recall that only one 
branch may be predicted at a time; thus when instruction 1 is resolved the BPU is free to 
predict instruction 5. 

During clock 4, the second branch instruction remains predicted while additional 
instructions move through the various pipelines. 

During clock cycle 5, the BPU realizes that the prediction made for instruction 5 was 
incorrect. Note that since instruction 6 was issued and executed conditionally, it never 
performed its writeback. As a result of the misprediction, all instructions that followed the 
branch in the instruction stream must be flushed from the respective execution unit 
pipelines. Notice that instructions 6 and 7 do not continue execution since it has been 
determined that these instructions should never have been issued in the first place. Since 
the branch has been resolved, a request is sent to the on-chip cache for the new instruction 
stream (based on the execution of instruction 5). During clock 6, the new set of instructions 
is in the IQ, and the appropriate decoding begins on clock cycle 7. 

Figure 7-6. Instruction Timing — Branch Not Taken 

7.3.1.2.2 Predicted "Taken" Branch Timing Examples 

Figure 7-7 depicts the case where branch instructions are predicted to be taken. During 
clock cycle 0, eight instructions are fed into the IQ. During clock cycle 1, the first branch 
(instruction 1) is decoded and executed (recall that the branch execution unit has a 
combined decode/execute stage). Note that as the branch is predicted during clock cycle 1, 
a request is sent to the on-chip cache for the new instruction stream. Also during clock cycle 
1, all subsequent instructions are not processed any further. Notice that these instructions 
are not discarded; they are simply not processed. 

Figure 7-7. Instruction Timing — Branch Taken 

During clock cycle 2, instruction 0 is executed and the branch (instruction 1) is resolved. It 
turns out that this branch was predicted incorrectly. As a result, the instructions that are 
being received from the on-chip cache are discarded. Additionally, processing begins again 
on the subsequent instructions in the IQ. 

It is important to note that the MPC601 does not discard instructions 2-7 in this example. 
The processor does not discard instructions until the new instructions have been received 
from the on-chip cache. This helps in the case of mispredictions (as shown here). 

During clock cycle 3, the next branch (instruction 5) has fallen into one of the bottom four 
positions in the IQ, and thus can be pulled out of the IQ into the BPU. The branch cannot 
be resolved immediately, and is predicted to be taken. As a result, processing stops on all 
subsequent instructions while processing continues on all previous instructions 
(instructions 2, 3, and 4). Additionally, a request is sent to the on-chip cache for the new 
instruction stream. 

During clock cycle 4, the new instructions arrive in the IQ, which forces instructions 6 and 
7 to be discarded. During clock 5, the second branch (instruction 5) is resolved and it is 
determined that the prediction was correct. As a result, instruction decode and execute 
continues. 

7.3.2 Integer Unit Execution Timing 

The integer unit executes all integer, bit-field, and data access instructions. Many of these 
instructions execute in a single clock cycle. The integer unit has one execute phase in its 
pipeline, thus when a multi-cycle integer instruction is being executed, no other integer 
instructions may begin an execute phase. Although a multi-cycle integer instruction may 
block the integer execute phase, it does not preclude subsequent integer instructions from 
being decoded. In addition, there is a one-entry buffer between the integer decode stage (IQ 
0) and the integer execute stage. If the execute stage of the integer unit is available, then a 
decoded instruction will fall through the buffer into the execute stage. 

The single execute stage in the integer unit is used differently for the single-cycle and 
multi-cycle integer instructions. For example, single-cycle integer instructions move 
through the execute stage in one clock cycle, as do data access instructions, which use the 
execute stage of the integer unit for address calculation. For multi-cycle integer instructions 
(such as mul), the same execute stage is used over and over until the multi-cycle 
instruction has completed execution. Figure 7-8 illustrates the instruction flow of the 
integer unit. 

Figure 7-8. Integer Unit Instruction Flow 

7.3.2.1 Integer Instructions Timing Examples 

Figure 7-9 illustrates the timing of the integer unit while executing a sequence of integer 
instructions. 

Figure 7-9. Instruction Timing — Integer Instructions 

Notice that each integer instruction takes only one clock cycle to decode. After the decode 
stage, the integer instruction moves into the execute phase of the integer unit pipeline. 
Notice, however, that the mul instruction uses five clock cycles (clocks 4-8) in the execute 
phase of the integer unit. As a result, during clock 5, the subsequent integer instruction 
cannot move into the execute phase. Instead it is held in the IU buffer as it passes from the 
decode stage (IQ-0) until the execute phase of the integer unit becomes available (during 
clock cycle 9). Although instruction 3 may not enter the execute phase (because that phase 
is still being used by instruction 2) on clock cycle 5, instruction 4 may enter the decode 
stage. 

During clock cycle 9, the mul instruction (instruction 2) moves from the execute stage into 
the writeback stage. As a result, instruction 3 is able to move into the execute stage and 
instruction 4 moves into the buffer since it has already been decoded. After the mul 
instruction is moved out of the execute phase of the pipeline, the single-cycle pipeline 
continues. 

It is important to note that if the integer unit buffer were not present, the timing of the 
instructions shown, with respect to when their writeback stages occur, would be the same. 
The real value of the buffer in the integer unit is that it allows instructions to be pulled out 
of the IQ even when the execute stage in the integer unit is busy. This in turn allows new 
instructions to enter the critical bottom four entries and possibly be decoded/executed by 
the floating-point unit or the branch processing unit. 
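
The timing of Figure 7-9 and the role of the one-entry buffer can be reproduced with a short calculation. The Python sketch below is a simplified model under stated assumptions: the function name and dictionary keys are hypothetical, every instruction is an integer instruction with no data dependencies, the first instruction decodes in clock cycle 1, and a new instruction enters decode in the cycle in which the previous one leaves it.

```python
def integer_unit_timing(latencies, has_buffer=True):
    """Compute per-instruction stage timing for the integer pipeline.

    latencies gives the number of execute cycles of each instruction in
    program order. Returns one dict per instruction with the decode cycle,
    the cycle in which it leaves decode, the (first, last) execute cycles,
    and the writeback cycle.
    """
    rows = []
    decode = 1
    prev_exec_start = 0
    prev_exec_end = 0
    for latency in latencies:
        earliest = decode + 1
        exec_start = max(earliest, prev_exec_end + 1)
        if has_buffer:
            # The buffer is free once the previous instruction is in execute.
            leave_decode = max(earliest, prev_exec_start)
        else:
            # Without a buffer the instruction waits in decode for execute.
            leave_decode = exec_start
        exec_end = exec_start + latency - 1
        rows.append({
            "decode": decode,
            "leave_decode": leave_decode,
            "execute": (exec_start, exec_end),
            "writeback": exec_end + 1,
        })
        decode = leave_decode
        prev_exec_start, prev_exec_end = exec_start, exec_end
    return rows


if __name__ == "__main__":
    for row in integer_unit_timing([1, 1, 5, 1, 1]):
        print(row)
```

For the sequence [1, 1, 5, 1, 1], where the third instruction stands in for mul, the model places mul in execute during clocks 4 through 8 and lets instruction 4 decode in clock 5. With has_buffer=False the writeback cycles are unchanged, but instruction 4 does not decode until clock 9, which illustrates why the buffer matters for freeing IQ entries rather than for integer writeback timing.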

7.3.2.2 Data Instructions Timing Examples 

As previously mentioned, the IU also does the address calculation for all data access 
instructions. Once the address calculation is complete (in one clock cycle), the data access 
instruction is moved to another stage, thus allowing additional integer instructions to be 
pushed through the integer unit. 

Figure 7-10 illustrates the timing of a data access instruction as it moves through the IU. 
During clock cycle 0, the instructions arrive into the IQ. During clock cycle 1, instruction 
0 is decoded while the subsequent instructions wait in the IQ. During clock cycle 2, 
instruction 0 begins its execute phase while instruction 1 is decoded. 

During clock cycle 3, instruction 2 is decoded while instruction 0 is writing results back 
into the general-purpose register file. Also during clock cycle 3, instruction 1 is executed. 
During the execution of instruction 1, the effective address is calculated and a request is 
sent to the on-chip cache for the data needed. 

On clock cycle 4, the on-chip cache is being accessed and data is being returned. While this 
access occurs, the execute phase of the IU becomes available to service instruction 2. Note 
that even if a cache miss occurred for this data access instruction, the execute phase of the 
IU would become available after only one cycle was used to calculate the effective address. 

As load and store operations move from the integer unit execution stage, they move to the 
cache. If a cache miss occurs, they are inserted in a buffer. There is a separate write buffer 
and read buffer. If a cache miss results from a load operation, that load instruction will 
remain in the read buffer until data is returned from memory and the cache and register file 
are updated. While this load is pending, additional load operations may access the cache. 
However, if a load access misses the cache while the read buffer is still holding a previously 
issued load, then the integer unit may experience a stall. If a cache miss results from a store 
operation, that store instruction will remain in the write buffer until the data has been 
properly stored. While this store is pending, additional store operations may access the 
cache, and may also be placed in the write buffer behind the original store if a cache miss 
occurs. 

Figure 7-10. Instruction Timing — Data Instructions 

Once the read buffer becomes full, additional load operations that miss the cache may
have to wait in the integer unit execution stage for a read buffer entry to become available.
In addition, once the three-entry write buffer becomes full, additional store operations
that miss the cache may have to wait in the integer unit execution stage for a write buffer
entry to become available.
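
The buffer-full behavior described above can be sketched as a bounded FIFO. This is a minimal sketch, not the MPC601 implementation: the class and method names are hypothetical, the only capacity taken from the text is the three-entry write buffer, and bus timing is not modeled.

```python
from collections import deque


class MissBuffer:
    """Bounded FIFO standing in for a read or write buffer of cache misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()

    def try_insert(self, operation):
        """Queue a missed operation; return False if it must wait in IU execute."""
        if len(self.entries) >= self.capacity:
            return False  # buffer full: the operation stalls in the execution stage
        self.entries.append(operation)
        return True

    def complete_oldest(self):
        """Retire the oldest pending miss, freeing one entry."""
        return self.entries.popleft()


if __name__ == "__main__":
    write_buffer = MissBuffer(capacity=3)  # three entries, per the text
    print([write_buffer.try_insert(f"store{n}") for n in range(4)])
    write_buffer.complete_oldest()
    print(write_buffer.try_insert("store3"))
```

A fourth store miss is refused until the oldest entry completes, which corresponds to the store waiting in the integer unit execution stage.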

For timing information on cache misses, refer to Section 7.2.3.3, "Cache Miss," which 
describes instruction fetches that miss in the on-chip cache. 

7.3.3 Floating-Point Unit Execution Timing 

The floating-point unit on the MPC601 executes all floating-point instructions with the 
exception of the floating-point load and store operations. For these instructions, it is the 
integer unit that does the effective address calculation, but a single clock cycle is needed 
from the FPU in order to move the data in or out of the floating-point register file. The 
timing of floating-point load and store instructions with respect to the FPU is discussed 
further in the following paragraphs.

The FPU has only two execute phases in its pipeline. Some floating-point
operations need more than two clock cycles of execute time. When the FPU
must execute these longer-latency instructions, both execute phases and the decode phase
may be used for multiple clock cycles. Figure 7-11 illustrates the pipeline of the
floating-point unit. Notice the secondary path out of the first and second execute phases that
allows repeated use of these stages by the same instruction.

Figure 7-11. Floating-Point Unit Instruction Flow

In fact, all double-precision multiply instructions spend multiple clock cycles in the
execute phases of the FPU. A double-precision multiply instruction
spends a minimum of two clock cycles in each phase of the FPU pipeline. That is, it
spends two clock cycles in the decode phase, a minimum of two clock cycles in FPU
execute 1, and a minimum of two clock cycles in FPU execute 2.

When a double-precision floating-point multiply instruction enters the FPU pipeline, 
previous floating-point pipeline stages (except for the FPU buffer) become unavailable. For 
example, when a double-precision multiply instruction moves from the FPU decode stage 
into the FPU execute stages, no other FPU instruction may enter the FPU decode stage until 
the multiply has moved out of the FPU execute stages into the writeback stage. 

7.3.3.1 Floating-Point Instructions Timing Examples 

The following paragraphs describe a sequence of the floating-point instructions as they pass 
through the various stages of the FPU.

Figure 7-12 illustrates an example of the sequence of floating-point instructions. In clock
0, the eight instructions are fed into the IQ. During clock cycle 1, instruction 0 is pulled from
the IQ into the floating-point buffer.

Figure 7-12. Instruction Timing — Floating-Point Instructions

During clock 2, instruction 0 is sent into the floating-point decode stage, which makes room
for instruction 1 to be fed into the floating-point buffer. Instructions 0 and 1 continue to flow
through the floating-point pipeline until they complete execution and write back their
results into the floating-point register file.

Instruction 2 has a data dependency on instruction 1. In other words, one of the operands
of instruction 2 is the result of instruction 1. For this reason, instruction 2 may not proceed
through the floating-point pipeline right behind instruction 1. Instruction 2 is held in the
floating-point decode stage during clocks 4-7 while instruction 1 completes execution. It
is during clock cycle 6 that instruction 1 completes execution and updates the floating-point
register file with its results. However, since there is no feed-forwarding mechanism in the
floating-point unit on the MPC601, instruction 2 must wait until clock 7 before it may begin
its decode stage. Notice that as instruction 2 is held in the decode stage, instruction 3 is
allowed to move into the floating-point buffer. This allows the IQ to be shifted down and
possibly reveal a branch instruction within the critical bottom four elements of the IQ.

During clock cycle 8, instruction 2 begins its execute phase, instruction 3 begins its decode
stage, and instruction 4 enters the floating-point buffer. The following clock cycles show
that the subsequent instructions flow through the FPU in a fully pipelined manner. Although
it is not shown here, the fdiv operation (instruction 6) will tie up the execute phases of the
FPU for 14 clock cycles. During this time, instruction 7 will be waiting in the floating-point
decode stage until the execute phase becomes available.

7.4 Memory Performance Considerations 

When instruction throughput can reach three instructions per clock cycle, a lack of data
bandwidth can become a performance bottleneck. In order for the MPC601 to approach its
potential performance levels, it must be able to read and write data quickly and efficiently.
If there are many processors in a system environment, one processor may experience long 
memory latencies while another bus master (for example, another processor or a direct 
memory access controller) is using the external bus. 

In order to alleviate this possible contention, the MPC601 provides three memory update 
modes: copy-back, write-through, and cache-inhibit. Each page of memory is specified to 
be in one of these modes. If a page is in copy-back mode, data being stored to that page is 
written only to the on-chip cache. If a page is in write-through mode, writes to that page 
update the on-chip cache on hits and always update main memory. If a page is cache- 
inhibited, data in that page will never be stored in the on-chip cache. All three of these 
modes of operation have advantages and disadvantages. A decision as to which mode to use 
depends on the system environment as well as the application. 

This section describes how performance is impacted by each memory update mode. For 
details about the operation of the on-chip cache and the memory update modes, see 
Chapter 4, "Cache and Memory Unit Operation." 

7.4.1 Copy-Back Mode 

When storing data while in copy-back mode, store operations for cacheable data do not
necessarily cause an external bus cycle to update memory. Instead, memory updates
occur only on line replacements, cache flushes, or when another processor attempts to access a
specific address for which there is a corresponding dirty cache entry. For this reason,
copy-back mode may be preferred when external bus bandwidth is a potential bottleneck (for
example, in a multiprocessor environment). Copy-back mode is also well suited for data that
is closely coupled to a processor, such as local variables.

If more than one processor uses data stored in a page that is in copy-back mode, snooping 
must be enabled to allow copy-back operations and cache invalidations of modified data. 
The MPC601 implements snooping hardware to prevent other devices from accessing 
invalid data. When bus snooping is enabled, the processor monitors the transactions of the
other devices. For example, if another device accesses a memory location for which the
MPC601 on-chip cache holds a modified value, and the memory-coherent (M) bit
corresponding to that page is set, the processor preempts the bus transaction and updates
memory with the cache data. The other device is then free to attempt an access to the
updated memory address. See Chapter 4, "Cache and Memory Unit Operation," for
complete information on bus snooping.

Copy-back mode provides complete cache/memory coherency and maximizes
available external bus bandwidth.

7.4.2 Write-Through Mode 

Store operations to memory in write-through mode always update memory as well as the 
on-chip cache (on cache hits). Write-through mode is used when the data in the cache must 
always agree with external memory (for example, video memory), or when there is shared 
(global) data that may be used frequently, or when allocation of a cache line on a cache miss 
is undesirable. Automatic copy back of cached data is not performed if that data is from a 
memory page marked as write-through mode since valid cache data always agrees with 
memory. 

Stores to memory that is in write-through mode may cause a decrease in performance. Each 
time a store is performed to memory in write-through mode, the bus will be busy for the 
extra clock cycles required to perform the memory update; therefore, pending load 
operations that miss the on-chip cache must wait while the external store operation 
completes. In addition, since the on-chip cache is shared for both instructions and data, any 
pending instruction fetches from the on-chip cache may also see an undesired latency. 

7.4.3 Cache-Inhibited Accesses 

If a memory page is specified to be cache-inhibited, data from this page will not be stored 
in the on-chip cache. 

Areas of the memory map can be cache-inhibited by the operating system software. If a 
cache-inhibited access hits in the on-chip cache, the corresponding cache line is 
invalidated. If the line is marked as modified, it is copied back to memory before being 
invalidated. 

In summary, the copy-back mode allows both load and store operations to use the on-chip 
cache. The write-through mode allows load operations to use the on-chip cache, but store 
operations cause a memory access and a cache update if the data is already in the cache. 
Lastly, the cache-inhibited mode causes memory access for both loads and stores. 
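
The summary above can be expressed as a small decision function. This is a hedged sketch of the behavior as described in this section only: the names are hypothetical, line allocation on a copy-back miss is not modeled, and the invalidation (with copy-back of modified data) that occurs when a cache-inhibited access hits is reduced to "does not write the cache."

```python
from enum import Enum


class UpdateMode(Enum):
    COPY_BACK = "copy-back"
    WRITE_THROUGH = "write-through"
    CACHE_INHIBITED = "cache-inhibited"


def store_targets(mode, cache_hit):
    """Return (writes_cache, writes_memory) for one store to a page in `mode`."""
    if mode is UpdateMode.COPY_BACK:
        # Data goes only to the on-chip cache; memory is updated later
        # (line replacement, cache flush, or a snoop hit on the dirty entry).
        return (True, False)
    if mode is UpdateMode.WRITE_THROUGH:
        # Memory is always updated; the cache is updated only on a hit.
        return (cache_hit, True)
    # Cache-inhibited: the store goes to memory and is never cached.
    return (False, True)


def load_may_use_cache(mode):
    """Loads can be serviced from the cache in every mode except cache-inhibited."""
    return mode is not UpdateMode.CACHE_INHIBITED


if __name__ == "__main__":
    for mode in UpdateMode:
        print(mode.value, store_targets(mode, cache_hit=True), load_may_use_cache(mode))
```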

7.5 Instruction Latency Summary 

Table 7-1 lists the latencies associated with each instruction executed by the MPC601. Note
that Table 7-1 contains no 64-bit architected instructions. These instructions will trap to an
illegal instruction exception handler when encountered. Recall that the term latency is
defined as the total time it takes to execute an instruction and make ready the results of that
instruction.

As previously stated, the FPU has no feed-forwarding capabilities. In other words, as a 
floating-point operation completes, another floating-point instruction that may be waiting 
for those results must wait for the data to be written into the register file before decode can 
begin. This extra time is accounted for in Table 7-1. 
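
As a rough illustration of how the lack of feed-forwarding enters the latency figures, the sketch below estimates when a dependent floating-point instruction can begin decode. It rests on an assumption that is not stated in the text: that the Table 7-1 latency is counted from the producer's own decode cycle through its writeback, so the dependent instruction may begin decode that many clocks after the producer did. The function name is hypothetical; the latency values are copied from the FPU rows of Table 7-1.

```python
# Latencies in clocks, copied from the FPU rows of Table 7-1.
FPU_LATENCY = {"fadd": 4, "fmul": 5, "fdivs": 17, "fdiv": 31}


def earliest_dependent_decode(producer_decode_cycle, mnemonic):
    """Earliest cycle in which a dependent FPU instruction can begin decode.

    Assumption: the table latency spans the producer's decode, execute,
    and writeback cycles, and the dependent instruction begins decode in
    the cycle after writeback because there is no feed-forwarding.
    """
    return producer_decode_cycle + FPU_LATENCY[mnemonic]


if __name__ == "__main__":
    print(earliest_dependent_decode(3, "fadd"))
```

For a four-clock producer that begins decode in clock 3, the result is clock 7. This agrees with the instruction 1 and instruction 2 walkthrough in Section 7.3.3.1, provided instruction 1 there is taken to decode in clock 3, which is inferred rather than stated.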

Table 7-1. MPC601 Instruction Latencies

Mnemonic      Instruction                                                    Latency (Clocks)                     Execution Unit
abs           Absolute                                                                                            IU
add[o][.]     Add                                                                                                 IU
addc[o][.]    Add Carrying                                                                                        IU
adde[o][.]    Add Extended                                                                                        IU
addi          Add Immediate                                                                                       IU
addic         Add Immediate Carrying                                                                              IU
addic.        Add Immediate Carrying and Record                                                                   IU
addis         Add Immediate Shifted                                                                               IU
addme[o][.]   Add to Minus One Extended                                                                           IU
addze[o][.]   Add to Zero Extended                                                                                IU
and[.]        AND                                                                                                 IU
andc[.]       AND with Complement                                                                                 IU
andi.         AND Immediate                                                                                       IU
andis.        AND Immediate Shifted                                                                               IU
b[l][a]       Branch                                                                                              BPU
bc[l][a]      Branch Conditional                                                                                  BPU
bcctr[l]      Branch Conditional to Count Register                                                                BPU
bclr[l]       Branch Conditional to Link Register                                                                 BPU
cmp           Compare                                                                                             IU
cmpi          Compare Immediate                                                                                   IU
cmpl          Compare Logical                                                                                     IU
cmpli         Compare Logical Immediate                                                                           IU
cntlzw[.]     Count Leading Zeros Word                                                                            IU
crand         Condition Register AND                                                                              IU
crandc        Condition Register AND with Complement                                                              IU
creqv         Condition Register Equivalent                                                                       IU
crnand        Condition Register NAND                                                                             IU
crnor         Condition Register NOR                                                                              IU
cror          Condition Register OR                                                                               IU
crorc         Condition Register OR with Complement                                                               IU
crxor         Condition Register XOR                                                                              IU
dcbf          Data Cache Block Flush                                                                              IU
dcbi          Data Cache Block Invalidate                                                                         IU
dcbst         Data Cache Block Store                                                                              IU
dcbt          Data Cache Block Touch                                                                              IU
dcbtst        Data Cache Block Touch for Store                                                                    IU
dcbz          Data Cache Block Set to Zero                                                                        IU
div[o][.]     Divide                                                         36                                   IU
divs[o][.]    Divide Short                                                   36                                   IU
divw[o][.]    Divide Word                                                    36                                   IU
divwu[o][.]   Divide Word Unsigned                                           36                                   IU
doz[o][.]     Difference or Zero                                                                                  IU
dozi          Difference or Zero Immediate                                                                        IU
eciwx         External Control Input Word Indexed                            1 (note 1)                           IU
ecowx         External Control Output Word Indexed                           1 (note 1)                           IU
eieio         Enforce In-Order Execution of I/O                              1 (note 1)                           IU
eqv[.]        Equivalent                                                                                          IU
extsb[.]      Extend Sign Byte                                                                                    IU
extsh[.]      Extend Sign Half Word                                                                               IU
fabs[.]       Floating-Point Absolute Value                                  4                                    FPU
fadd[.]       Floating-Point Add                                             4                                    FPU
fadds[.]      Floating-Point Add Single-Precision                            4                                    FPU
fcmpo         Floating-Point Compare Ordered                                 4                                    FPU
fcmpu         Floating-Point Compare Unordered                               4                                    FPU
fctiw[.]      Floating-Point Convert to Integer Word                         4                                    FPU
fctiwz[.]     Floating-Point Convert to Integer Word with Round toward Zero  4                                    FPU
fdiv[.]       Floating-Point Divide                                          31                                   FPU
fdivs[.]      Floating-Point Divide Single-Precision                         17                                   FPU
fmadd[.]      Floating-Point Multiply-Add                                    5                                    FPU
fmadds[.]     Floating-Point Multiply-Add Single-Precision                   4                                    FPU
fmr[.]        Floating-Point Move Register                                   4                                    FPU
fmsub[.]      Floating-Point Multiply-Subtract                               5                                    FPU
fmsubs[.]     Floating-Point Multiply-Subtract Single-Precision              4                                    FPU
fmul[.]       Floating-Point Multiply                                        5                                    FPU
fmuls[.]      Floating-Point Multiply Single-Precision                       4                                    FPU
fnabs[.]      Floating-Point Negative Absolute Value                         4                                    FPU
fneg[.]       Floating-Point Negate                                          4                                    FPU
fnmadd[.]     Floating-Point Negative Multiply-Add                           5                                    FPU
fnmadds[.]    Floating-Point Negative Multiply-Add Single-Precision          4                                    FPU
fnmsub[.]     Floating-Point Negative Multiply-Subtract                      5                                    FPU
fnmsubs[.]    Floating-Point Negative Multiply-Subtract Single-Precision     4                                    FPU
fres[.]       Floating-Point Reciprocal Estimate Single-Precision            Not implemented (trap)
frsp[.]       Floating-Point Round to Single-Precision                       4                                    FPU
frsqrte[.]    Floating-Point Reciprocal Square Root Estimate                 Not implemented (trap)
fsel[.]       Floating-Point Select                                          Not implemented (trap)
fsqrt[.]      Floating-Point Square Root                                     Not implemented (trap)
fsqrts[.]     Floating-Point Square Root Single-Precision                    Not implemented (trap)
fsub[.]       Floating-Point Subtract                                        4                                    FPU
fsubs[.]      Floating-Point Subtract Single-Precision                       4                                    FPU
icbi          Instruction Cache Block Invalidate                             1 (note 1)                           IU
isync         Instruction Synchronize                                        Serialize                            IU
lbz           Load Byte and Zero                                             2                                    IU
lbzu          Load Byte and Zero with Update                                 2                                    IU
lbzux         Load Byte and Zero with Update Indexed                         2                                    IU
lbzx          Load Byte and Zero Indexed                                     2                                    IU
lfd           Load Floating-Point Double-Precision                           3                                    IU
lfdu          Load Floating-Point Double-Precision with Update               3                                    IU
lfdux         Load Floating-Point Double-Precision with Update Indexed       3                                    IU
lfdx          Load Floating-Point Double-Precision Indexed                   3                                    IU
lfs           Load Floating-Point Single-Precision                           3                                    IU
lfsu          Load Floating-Point Single-Precision with Update               3                                    IU
lfsux         Load Floating-Point Single-Precision with Update Indexed       3                                    IU
lfsx          Load Floating-Point Single-Precision Indexed                   3                                    IU
lha           Load Half Word Algebraic                                       2                                    IU
lhau          Load Half Word Algebraic with Update                           2                                    IU
lhaux         Load Half Word Algebraic with Update Indexed                   2                                    IU
lhax          Load Half Word Algebraic Indexed                               2                                    IU
lhbrx         Load Half Word Byte-Reverse Indexed                            2                                    IU
lhz           Load Half Word and Zero                                        2                                    IU
lhzu          Load Half Word and Zero with Update                            2                                    IU
lhzux         Load Half Word and Zero with Update Indexed                    2                                    IU
lhzx          Load Half Word and Zero Indexed                                2                                    IU
lmw           Load Multiple Word                                             1 + number of registers transferred  IU
lscbx         Load String and Compare Byte Indexed                           1 + number of registers transferred  IU
lswi          Load String Word Immediate                                     1 + number of registers transferred  IU
lswx          Load String Word Indexed                                       1 + number of registers transferred  IU
lwarx         Load Word and Reserve Indexed                                  2                                    IU
lwbrx         Load Word Byte-Reverse Indexed                                 2                                    IU
lwz           Load Word and Zero                                             2                                    IU
lwzu          Load Word and Zero with Update                                 2                                    IU
lwzux         Load Word and Zero with Update Indexed                         2                                    IU
lwzx          Load Word and Zero Indexed                                     2                                    IU
maskg[.]      Mask Generate                                                  1                                    IU
maskir[.]     Mask Insert from Register                                      1                                    IU
mcrf          Move Condition Register Field                                  2                                    IU
mcrfs         Move to Condition Register from FPSCR                          2                                    IU
mcrxr         Move to Condition Register from XER                            2                                    IU
mfcr          Move from Condition Register                                   1                                    IU
mffs[.]       Move from FPSCR                                                4                                    IU
mfmsr         Move from Machine State Register                               1                                    IU
mfspr         Move from Special Purpose Register                             Variable                             IU
mfsr          Move from Segment Register                                     2                                    IU
mfsrin        Move from Segment Register Indirect                            2                                    IU
mftb          Move from Time Base                                            Not implemented (trap)
mtcrf         Move to Condition Register Fields                              2                                    IU
mtfsb0[.]     Move to FPSCR Bit 0                                            4                                    IU
mtfsb1[.]     Move to FPSCR Bit 1                                            4                                    IU
mtfsf[.]      Move to FPSCR Fields                                           4                                    IU
mtfsfi[.]     Move to FPSCR Field Immediate                                  4                                    IU
mtmsr         Move to Machine State Register                                 Serialize                            IU
mtspr         Move to Special Purpose Register                               Variable                             IU
mtsr          Move to Segment Register                                       1                                    IU
mtsrin        Move to Segment Register Indirect                              1                                    IU
mul[o][.]     Multiply                                                       5/9 (note 3)                         IU
mulhw[.]      Multiply High Word                                             5                                    IU
mulhwu[.]     Multiply High Word Unsigned                                    5/9/10 (note 4)                      IU
mull[o][.]    Multiply Low                                                   5                                    IU
mulli         Multiply Low Immediate                                         5                                    IU
nabs          Negative Absolute                                              1                                    IU
nand[.]       NAND                                                                                                IU
neg[o][.]     Negate                                                                                              IU
nor[.]        NOR                                                                                                 IU
or[.]         OR                                                                                                  IU
orc[.]        OR with Complement                                                                                  IU
ori           OR Immediate                                                                                        IU
oris          OR Immediate Shifted                                                                                IU
rfi           Return from Interrupt                                          Serialize                            IU
rlmi[.]       Rotate Left then Mask Insert                                                                        IU
rlwimi[.]     Rotate Left Word Immediate then Mask Insert                                                         IU
rlwinm[.]     Rotate Left Word Immediate then AND with Mask                                                       IU
rlwnm[.]      Rotate Left Word then AND with Mask                                                                 IU
rrib[.]       Rotate Right and Insert Bit                                                                         IU
sc            System Call                                                    Serialize                            IU
sle[.]        Shift Left Extended                                                                                 IU
sleq[.]       Shift Left Extended with MQ                                                                         IU
sliq[.]       Shift Left Immediate with MQ                                                                        IU
slliq[.]      Shift Left Long Immediate with MQ                                                                   IU
sllq[.]       Shift Left Long with MQ                                                                             IU
slq[.]        Shift Left with MQ                                                                                  IU
slw[.]        Shift Left Word                                                                                     IU
sraq[.]       Shift Right Algebraic with MQ                                                                       IU
sraiq[.]      Shift Right Algebraic Immediate with MQ                                                             IU
sraw[.]       Shift Right Algebraic Word                                                                          IU
srawi[.]      Shift Right Algebraic Word Immediate                                                                IU
sre[.]        Shift Right Extended                                                                                IU
srea[.]       Shift Right Extended Algebraic                                                                      IU
sreq[.]       Shift Right Extended with MQ                                                                        IU
sriq[.]       Shift Right Immediate with MQ                                                                       IU
srliq[.]      Shift Right Long Immediate with MQ                                                                  IU
srlq[.]       Shift Right Long with MQ                                                                            IU
srq[.]        Shift Right with MQ                                                                                 IU
srw[.]        Shift Right Word                                                                                    IU
stb           Store Byte                                                                                          IU
stbu          Store Byte with Update                                                                              IU
stbux         Store Byte with Update Indexed                                                                      IU
stbx          Store Byte Indexed                                                                                  IU
stfd          Store Floating-Point Double-Precision                                                               IU
stfdu         Store Floating-Point Double-Precision with Update                                                   IU
stfdux        Store Floating-Point Double-Precision with Update Indexed                                           IU
stfdx         Store Floating-Point Double-Precision Indexed                                                       IU
stfiwx        Store Floating-Point as Integer Word Indexed                                                        IU
stfs          Store Floating-Point Single-Precision                                                               IU
stfsu         Store Floating-Point Single-Precision with Update                                                   IU
stfsux        Store Floating-Point Single-Precision with Update Indexed                                           IU
stfsx         Store Floating-Point Single-Precision Indexed                                                       IU
sth           Store Half Word                                                                                     IU
sthbrx        Store Half Word Byte-Reverse Indexed                                                                IU
sthu          Store Half Word with Update                                                                         IU
sthux         Store Half Word with Update Indexed                                                                 IU
sthx          Store Half Word Indexed                                                                             IU
stmw          Store Multiple Word                                                                                 IU
stswi         Store String Word Immediate                                                                         IU
stswx         Store String Word Indexed                                                                           IU
stw           Store Word                                                                                          IU
stwbrx        Store Word Byte-Reverse Indexed                                                                     IU
stwcx.        Store Word Conditional Indexed                                                                      IU
stwu          Store Word with Update                                                                              IU
stwux         Store Word with Update Indexed                                                                      IU
stwx          Store Word Indexed                                                                                  IU
subf[o][.]    Subtract from                                                                                       IU
subfc[o][.]   Subtract from Carrying                                                                              IU
subfe[o][.]   Subtract from Extended                                                                              IU
subfic        Subtract from Immediate Carrying                               1                                    IU
subfme[o][.]  Subtract from Minus One Extended                               1                                    IU
subfze[o][.]  Subtract from Zero Extended                                    1                                    IU
sync          Synchronize                                                    Serialize bus operations             IU
tlbia         Translation Lookaside Buffer Invalidate All                    Not implemented (trap)               —
tlbie         Translation Lookaside Buffer Invalidate Entry                  Serialize                            IU
tlbiex        Translation Lookaside Buffer Invalidate Entry by Index         Not implemented (trap)               —
tw            Trap Word                                                      1 (note 2)                           IU
twi           Trap Word Immediate                                            1 (note 2)                           IU
xor[.]        XOR                                                            1                                    IU
xori          XOR Immediate                                                  1                                    IU
xoris         XOR Immediate Shifted                                          1                                    IU



Note 1. These instructions access the system bus; the latency may therefore vary depending on
the exact state of the machine.

Note 2. These instructions serialize the processor if the trap is taken.

Note 3. The longer latency may occur if the contents of rB are larger than 16 bits (not including
sign-extending bits).

Note 4. The shortest latency occurs if rB <= 16 bits. The longer latency occurs if rB > 16 bits but
the most significant bit is still 0. The longest latency occurs if the most significant bit is 1.
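
The operand-dependent latency of mulhwu can be sketched as a lookup on the magnitude of rB. This is an illustrative reading of the footnote, not a specification: it assumes "rB <= 16 bits" means the unsigned value fits in the low 16 bits, and that the three figures 5/9/10 in Table 7-1 correspond, in order, to the shortest, longer, and longest cases. The function name is hypothetical.

```python
def mulhwu_latency(rb):
    """Estimated mulhwu latency in clocks for a 32-bit unsigned rB operand."""
    if not 0 <= rb <= 0xFFFFFFFF:
        raise ValueError("rB must be a 32-bit unsigned value")
    if rb <= 0xFFFF:
        return 5   # rB fits in 16 bits: shortest latency
    if rb < 0x80000000:
        return 9   # more than 16 bits, most significant bit still 0
    return 10      # most significant bit is 1: longest latency


if __name__ == "__main__":
    for value in (0x1234, 0x12345, 0x80000000):
        print(hex(value), mulhwu_latency(value))
```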
Chapter 8

Signal Descriptions



This chapter describes the MPC601 microprocessor's external signals. It contains a concise
description of individual signals, showing behavior when the signal is asserted and negated
and when the signal is an input and an output.

NOTE

A bar over a signal name indicates that the signal is active
low, for example, ARTRY (address retry) and TS (transfer
start). Active-low signals are referred to as asserted (active)
when they are low and negated when they are high. Signals that
are not active-low, such as AP0-AP3 (address bus parity
signals) and TT0-TT4 (transfer type signals), are referred to as
asserted when they are high and negated when they are low.
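
The asserted/negated convention in the note above can be sketched as a one-line helper. The function name and the boolean encoding of pin levels are hypothetical and serve only to illustrate the convention.

```python
def is_asserted(pin_is_high, active_low):
    """Return True if a signal is asserted, given its electrical level.

    Active-low signals (for example ARTRY and TS) are asserted when low;
    active-high signals (for example AP0-AP3 and TT0-TT4) are asserted
    when high.
    """
    return pin_is_high != active_low


if __name__ == "__main__":
    print(is_asserted(pin_is_high=False, active_low=True))   # ARTRY driven low
    print(is_asserted(pin_is_high=True, active_low=False))   # TT0 driven high
```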

The MPC601 signals are grouped as follows: 

• Address arbitration signals — The MPC601 uses these signals to arbitrate for address
bus mastership.

• Address transfer start signals — These signals indicate that a bus master has begun a
transaction on the address bus.

• Address transfer signals — These signals, which consist of the address bus, address
parity, and address parity error signals, are used to transfer the address and to ensure
the integrity of the transfer.

• Transfer attribute signals — These signals provide information about the type of
transfer, such as the transfer size and whether the transaction is bursted,
write-through, or cache-inhibited.

• Address transfer termination signals — These signals are used to acknowledge the
end of the address phase of the transaction. They also indicate whether a condition
exists that requires the address phase to be repeated.

• Data arbitration signals — The MPC601 uses these signals to arbitrate for data bus
mastership.

• Data transfer signals — These signals, which consist of the data bus, data parity, and
data parity error signals, are used to transfer the data and to ensure the integrity of
the transfer.

• Data transfer termination signals — Data termination signals are required after each
data beat in a data transfer. In a single-beat transaction, the data termination signals
also indicate the end of the tenure, while in burst accesses, the data termination
signals apply to individual beats and indicate the end of the tenure only after the final
data beat. They also indicate whether a condition exists that requires the data phase
to be repeated.

• System status signals — These signals include the external interrupt signal,
checkstop signals, and both soft- and hard-reset signals. These signals are used to
interrupt and, under various conditions, to reset the processor.

• Processor state signals — These two signals are used to set the reservation coherency
bit and set the size of the MPC601's output buffers.

• Miscellaneous signals — These signals provide information about the state of the
reservation coherency bit and the size of the MPC601's output buffers.

• COP interface signals — The common on-chip processor (COP) unit is the master
clock control unit and it provides a serial interface to the system for performing
built-in self test (BIST).

• Test interface signals — These signals are used for internal testing.

• Clock signals — These signals determine the system clock frequency. These signals
can also be used to synchronize multiprocessor systems.

8.1 Signal Configuration 

Figure 8-1 illustrates the MPC601 microprocessor's pin configuration, showing how the 
signals are grouped. 

NOTE 

A pinout showing actual pin numbers is included in the
MPC601 microprocessor electrical specifications.

[Figure: signal groups shown include address arbitration, address transfer start, address
transfer, transfer attribute, address termination, clocks, and test.]

Figure 8-1. MPC601 Signal Groups

8.2 Signal Descriptions 

This section describes individual MPC601 signals, grouped according to Figure 8-1. 
Note that the following sections are intended to provide a quick summary of signal 
functions. Chapter 9, "System Interface Operation," describes many of these signals in 
greater detail, both with respect to how individual signals function and how groups of 
signals interact. 

8.2.1 Address Bus Arbitration Signals

The address arbitration signals are a collection of input and output signals the MPC601 uses 
to request the address bus, recognize when the request is granted, and indicate to other 
devices when mastership is granted. For a detailed description of how these signals interact, 
see Section 9.3.1, "Address Bus Arbitration."

8.2.1.1 Bus Request (BR)—Output

The bus request (BR) signal is an output signal on the MPC601. Following are the state
meaning and timing comments for the BR signal.

State Meaning      Asserted — Indicates that the MPC601 is requesting mastership of
                   the address bus. See Section 9.3.1, "Address Bus Arbitration."

                   Negated — Indicates that the MPC601 is not requesting the address
                   bus. The MPC601 may have no bus operation pending, it may be
                   parked, or the ARTRY input was asserted on the previous bus clock
                   cycle.

Timing Comments    Assertion — Occurs when the MPC601 is not parked and a bus
                   transaction is needed. This may occur even if the two possible
                   pipeline accesses have occurred.

                   Negation — Occurs for at least one bus clock cycle after an accepted,
                   qualified bus grant (see BG and ABB), even if another transaction is
                   pending. It is also negated for at least one bus clock cycle when the
                   assertion of ARTRY is detected on the bus.

8.2.1.2 Bus Grant (BG)—Input

The bus grant (BG) signal is an input signal on the MPC601. Following are the state
meaning and timing comments for the BG signal.

State Meaning      Asserted — Indicates that the MPC601 may, with the proper
                   qualification, assume mastership of the address bus. A qualified bus
                   grant occurs when BG is asserted and ABB and ARTRY are not
                   asserted. The ABB signal is driven by the MPC601 or another bus
                   master, but ARTRY is driven only by the bus. If the MPC601 is
                   parked, BR need not be asserted for the qualified bus grant. See
                   Section 9.3.1, "Address Bus Arbitration."

                   Negated — Indicates that the MPC601 is not the next potential
                   address bus master.

Timing Comments    Assertion — May occur at any time to indicate the MPC601 is free to
                   use the address bus. After the MPC601 assumes bus mastership, it
                   does not check for a qualified bus grant again until the cycle during
                   which the address bus tenure is completed (assuming it has another
                   transaction to run). The MPC601 does not accept a BG in the cycles
                   between the assertion of any TS or XATS and AACK.

                   Negation — May occur at any time to indicate the MPC601 cannot
                   use the bus. The MPC601 may still assume bus mastership on the bus
                   clock cycle of the negation of BG because during the previous cycle
                   BG indicated to the MPC601 that it was free to take mastership (if
                   qualified).
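
The qualified bus grant condition stated above can be written as a boolean expression. This is a minimal sketch in terms of logical signal states (asserted or negated) rather than pin levels; the function name is hypothetical, and the timing restrictions described in the timing comments are not modeled.

```python
from itertools import product


def qualified_bus_grant(bg_asserted, abb_asserted, artry_asserted):
    """A bus grant is qualified when BG is asserted and ABB and ARTRY are not."""
    return bg_asserted and not abb_asserted and not artry_asserted


if __name__ == "__main__":
    # Print the full truth table for the three inputs.
    for bg, abb, artry in product((False, True), repeat=3):
        print(bg, abb, artry, qualified_bus_grant(bg, abb, artry))
```

Only one of the eight input combinations yields a qualified grant: BG asserted with both ABB and ARTRY negated.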
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8.2.1.3 Address Bus Busy (ABB) 

The address bus busy (ABB) signal is both an input and an output signal. 

8.2.1 .3.1 Address Bus Busy (TfBB)— Output 

Following are the state meaning and timing comments for the ABB output signal. 

State Meaning Asserted — Indicates that the MPC601 is the address bus master. See

Section 9.3.1, "Address Bus Arbitration." 

Negated — Indicates that the MPC601 is not using the address bus. 
If ABB is negated during the bus clock cycle following a qualified 
bus grant, the MPC601 did not accept mastership, even if BR was 
asserted. This can occur if a potential transaction is aborted 
internally before the transaction is started. 

Timing Comments Assertion — Occurs on the bus clock cycle following a qualified BG
that is accepted by the processor (see Negated). 

Negation — Occurs on the bus clock cycle following the assertion of 
AACK. If ABB is negated during the bus clock cycle following a 
qualified bus grant, the MPC601 did not accept mastership, even if 
BR was asserted.

High Impedance — Occurs one-half processor clock cycle after ABB 
is negated. 

8.2.1.3.2 Address Bus Busy (ABB)— Input

Following are the state meaning and timing comments for the ABB input signal. 

State Meaning Asserted — Indicates that the address bus is in use. This condition 

effectively blocks the MPC601 from assuming address bus 
ownership, regardless of the BG input. Optional. (See Section 9.3.1, 
"Address Bus Arbitration.") 

Negated — Indicates that the address bus is not owned by another bus 
master and that it is available to the MPC601 when accompanied by 
a qualified bus grant. 

Timing Comments Assertion — May occur when the MPC601 must be prevented from 
using the address bus (and the processor is not currently asserting 
ABB). 

Negation — May occur whenever the MPC601 can use the address 
bus. 

Note that this signal is logically ORed with an internally generated address bus busy signal. 
For more information, see Section 9.3.1, "Address Bus Arbitration."

8.2.2 Address Transfer Start Signals

Address transfer start signals are input and output signals that indicate that an address bus 
transfer has begun. The transfer start (TS) signal identifies the operation as a memory
transaction; extended address transfer start (XATS) identifies the transaction as an I/O 
controller interface operation. 

For detailed information about how TS and XATS interact with other signals, refer to
Section 9.3.2, "Address Transfer," and Section 9.6, "I/O Controller Interface Operation," 
respectively. 

8.2.2.1 Transfer Start (TS)

The transfer start (TS) signal is both an input and an output signal on the MPC601.

8.2.2.1.1 Transfer Start (TS)— Output

Following are the state meaning and timing comments for the TS output signal. 

State Meaning Asserted — Indicates that the MPC601 has begun a memory bus 

transaction and that the address-bus and transfer-attribute signals are 
valid. It is also an implied data bus request for a memory transaction 
(unless it is an address-only operation).

Negated — Is negated during an I/O controller interface operation. 

Timing Comments Assertion — Coincides with the assertion of ABB. 

Negation — Occurs one bus clock cycle after TS is asserted.
High Impedance — Coincides with the negation of ABB. 

8.2.2.1.2 Transfer Start (TS)— Input

Following are the state meaning and timing comments for the TS input signal. 

State Meaning Asserted — Indicates that another master has begun a bus transaction 

and that the address bus and transfer attribute signals are valid for 
snooping (see GBL). 

Negated — Has no meaning. 

Timing Comments Assertion — May occur during the assertion of ABB. 

Negation — Must occur one bus clock cycle after TS is asserted. 



8.2.2.2 Extended Address Transfer Start (XATS) 

The extended address transfer start (XATS) signal is both an input and an output signal on
the MPC601.

8.2.2.2.1 Extended Address Transfer Start (XATS)— Output

Following are the state meaning and timing comments for the XATS output signal. 

State Meaning Asserted — Indicates that the MPC601 has begun an I/O controller 

interface operation and that the first address cycle is valid. It is also 
an implied data bus request for certain I/O controller interface
operations (unless it is an address-only operation).

Negated — Is negated during an entire memory transaction. 

Timing Comments Assertion — Coincides with ABB. 

Negation — Occurs one bus clock cycle after the assertion of XATS. 

High Impedance — Coincides with the negation of ABB. 

8.2.2.2.2 Extended Address Transfer Start (XATS)— Input 

Following are the state meaning and timing comments for the XATS input signal. 

State Meaning Asserted — Indicates that the MPC601 must check for an I/O 

controller interface reply operation with a receiver tag that
matches bits 28-31 of the MPC601 PID register.

Negated — Indicates that there is no need to check for an I/O 
controller interface operation reply. 

Timing Comments Assertion — May occur while ABB is asserted. 

Negation — Must occur one bus clock cycle after XATS is asserted. 

8.2.3 Address Transfer Signals 

The address transfer signals are used to transmit the address and to generate and monitor 
parity for the address transfer. For a detailed description of how these signals interact, refer 
to Section 9.3.2, "Address Transfer." 

8.2.3.1 Address Bus (A0-A31)

The address bus (A0-A31) consists of 32 signals that are both input and output signals. 

8.2.3.1.1 Address Bus (A0-A31)— Output 

Following are the state meaning and timing comments for the A0-A31 output signals.

State Meaning Asserted/Negated — Represents the physical address of the data to be
transferred. On burst transfers, the address bus presents the
quad-word-aligned address containing the critical code/data that missed
the cache. See Section 9.3.2, "Address Transfer."

Timing Comments Assertion/Negation — Occurs on the bus clock cycle after a qualified
bus grant (coincides with assertion of ABB and TS).

High Impedance — Occurs one bus clock cycle after AACK is
asserted.

8.2.3.1.2 Address Bus (A0-A31)— Input

Following are the state meaning and timing comments for the A0-A31 input signals.

State Meaning Asserted/Negated — Represents the physical address of a snoop 

operation. 

Timing Comments Assertion/Negation — Must occur on the same bus clock cycle as the 
assertion of TS. 

8.2.3.1.3 Address Bus (A0-A31)— Output (I/O Controller Interface 
Operations) 

Following are the state meaning and timing comments for the address bus signals
(A0-A31) for output I/O controller interface operations on the MPC601.

State Meaning Asserted/Negated — For I/O controller interface operations where 

the MPC601 is the master, the address tenure consists of two packets 
(each requiring a bus cycle). For packet 0, these signals convey 
control and tag information. For packet 1, these signals represent the
physical address of the data to be transferred. 

Timing Comments Assertion/Negation — Address tenure consists of two beats. The first
beat occurs on the bus clock cycle after a qualified bus grant,
coinciding with XATS. The address bus transitions to the second
beat on the next bus clock cycle.

High Impedance — Occurs on the bus clock cycle after AACK is
asserted.

8.2.3.1.4 Address Bus (A0-A31)— Input (I/O Controller Interface 
Operations) 

Following are state meaning and timing comments for input I/O controller interface 
operations on the MPC601. 

State Meaning Asserted/Negated — When the MPC601 is not the master, it snoops 

(and checks address parity) on the first address beat only of all I/O 
controller interface operations for an I/O reply operation with a 
receiver tag that matches its PID tag. See Section 9.6, "I/O 
Controller Interface Operation." 

Timing Comments Assertion/Negation — The MPC601 looks for only the first beat of
the I/O transfer address tenure, which coincides with XATS. The
second address bus beat is not required by the MPC601.

8.2.3.2 Address Bus Parity (AP0-AP3) 

The address bus parity (AP0-AP3) signal is both an input and output signal that has four
pin locations on the MPC601.

8.2.3.2.1 Address Bus Parity (AP0-AP3)— Output

Following are the state meaning and timing comments for the AP0-AP3 output signal on
the MPC601.

State Meaning Asserted/Negated — Represents odd parity for each of four bytes of 

the physical address for a transaction. Odd parity means that an odd
number of bits, including the parity bit, are driven high. The signal
assignments correspond to the following:

AP0 A0-A7
AP1 A8-A15
AP2 A16-A23
AP3 A24-A31

For more information, see Section 9.3.2.1, "Address Bus Parity." 
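
As a rough illustration of the odd-parity rule and the byte assignments above, the following sketch computes the four parity bits for a 32-bit address. The function names are hypothetical, and the code assumes the usual PowerPC bit numbering in which A0 is the most significant address bit, so AP0 covers the most significant byte.

```python
def address_parity(address):
    """Return [AP0, AP1, AP2, AP3] as 0/1 values for a 32-bit address.

    Assumption: A0 is the most significant bit, so AP0 covers the
    most significant byte (A0-A7) and AP3 the least significant (A24-A31).
    """
    parity_bits = []
    for lane in range(4):
        byte = (address >> (24 - 8 * lane)) & 0xFF
        ones = bin(byte).count("1")
        # Odd parity: the ones in the byte plus the parity bit must be odd.
        parity_bits.append(0 if ones % 2 == 1 else 1)
    return parity_bits


def has_address_parity_error(address, ap_bits):
    """True when the sampled AP0-AP3 values do not give odd parity."""
    return address_parity(address) != list(ap_bits)
```

An all-zero byte therefore needs its parity bit driven high, which is why address_parity(0) returns four ones.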

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

8.2.3.2.2 Address Bus Parity (AP0-AP3)— Input 

Following are the state meaning and timing comments for the AP0-AP3 input signal on the
MPC601.

State Meaning Asserted/Negated — Represents odd parity for each of four bytes of
the physical address for snooping and I/O controller interface
operations. Detected even parity causes the processor to enter the
checkstop state if address parity checking is enabled in the HID0
register (see Section 2.3.3.12.1, "Checkstop Sources and Enables
Register — HID0"). See also the APE signal description below.

Timing Comments Assertion/Negation — The same as A0-A31.

8.2.3.3 Address Parity Error (APE)— Output 

The address parity error (APE) signal is an output signal on the MPC601. Following are the 
state meaning and timing comments for the APE signal on the MPC601. For more 
information, see Section 9.3.2.1, "Address Bus Parity." 



State Meaning Asserted — Indicates incorrect address bus parity has been detected
by the MPC601 on a snoop (GBL asserted). This includes the first
address beat of an I/O controller interface operation.

Negated — Indicates that the MPC601 has not detected a parity error
(even parity) on the address bus.

Timing Comments Assertion — Occurs on the second bus clock cycle after TS or XATS
is asserted.

Negation — Occurs on the third bus clock cycle after TS or XATS is
asserted.

8.2.4 Address Transfer Attribute Signals

The transfer attribute signals are a set of signals that further characterize the transfer — such 
as the size of the transfer, whether it is a read or write operation, and whether it is a burst 
or single-beat transfer. For a detailed description of how these signals interact, see 
Section 9.3.2, "Address Transfer." 

Note that some signal functions vary depending on whether the transaction is a memory 
access or an I/O access. For a description of how these signals function for I/O controller 
interface operations, see Section 9.6, "I/O Controller Interface Operation." 

8.2.4.1 Transfer Type (TT0-TT4) 

The transfer type (TT0-TT4) signals consist of four input/output signals and one
output-only signal on the MPC601. For a complete description of TT0-TT4 signals, see Table 8-1
and for transfer type encodings, see Table 8-2.

8.2.4.1.1 Transfer Type (TT0-TT4)— Output

Following are the state meaning and timing comments for the TT0-TT4 output signals on
the MPC601.

State Meaning Asserted/Negated — Indicates the type of transfer in progress. These 

bits roughly correspond to the following decoded operations: 
• Atomic
• Read/write
• Invalidate
• Memory cycle

For I/O controller interface operations these signals are part of the
I/O transfer code, the extended transfer code (XATC), along with
TSIZ and TBST:

XATC(0:7) = TT(0:3) || TBST || TSIZ(0:2).

TT4 is driven negated as an output on the MPC601 and is defined for 
future expansion. 
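
The concatenation that forms the extended transfer code can be sketched as simple bit packing. The helper names below are hypothetical, and the sketch assumes that XATC bit 0 (carried by TT0) is the most significant bit of the 8-bit value and that inputs are logical field values rather than pin voltages.

```python
def pack_xatc(tt, tbst, tsiz):
    """Concatenate TT(0:3) || TBST || TSIZ(0:2) into an 8-bit value.

    Assumption: XATC bit 0 is the most significant bit of the result.
    """
    if not (0 <= tt <= 0xF and tbst in (0, 1) and 0 <= tsiz <= 0x7):
        raise ValueError("field out of range")
    return (tt << 4) | (tbst << 3) | tsiz


def unpack_xatc(xatc):
    """Split an 8-bit XATC value back into (tt, tbst, tsiz)."""
    return (xatc >> 4) & 0xF, (xatc >> 3) & 0x1, xatc & 0x7
```

Packing TT = 1010, TBST = 1, and TSIZ = 011 yields the byte 10101011, and unpack_xatc reverses the operation.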

Timing Comments Assertion/Negation/High Impedance — The same as A0-A31.

8.2.4.1.2 Transfer Type (TT0-TT3)— Input 

Following are the state meaning and timing comments for the TT0-TT3 input signals on
the MPC601.

State Meaning Asserted/Negated — Indicates the type of transfer in progress (see 

Table 8-2). For I/O controller interface operations these signals form 
part of the extended address transfer code (XATC) and are snooped 
by the MPC601 if XATS is asserted.

Timing Comments Assertion/Negation — The same as A0-A31.






Table 8-1. TT0-TT4 Signal Description

Signal  Description

TT0     Special operations: The MPC601 drives this signal to indicate that the access is part of an atomic data
        access sequence. This signal is asserted whenever a bus transaction is run in response to a
        lwarx/stwcx instruction pair, a tlbi operation, or either an eciwx or ecowx instruction.

TT1     Read (/write) operations: This signal indicates whether the transaction is a read (TT1 high) or a write
        (TT1 low).

TT2     Invalidate operations: When asserted with GBL, the TT2 output signal indicates that all other caches in
        the system should invalidate the cache entry on a snoop hit. If the snoop hit is to a modified entry, the
        sector should be copied back before being invalidated.

TT3     Address-only operations: This signal, when asserted, indicates that the data transfer is to/from memory.
        External logic can synthesize a data bus request from the combined assertions of TS (or XATS) and TT3.
        If TT3 is not asserted with the address, the associated bus transaction is considered to be a broadcast
        operation that all potential bus masters must honor (or a reserved operation), except for the external
        control functions (eciwx and ecowx) which require both address and data tenures.

TT4     Reserved. Always negated (low state). (For expandability)


Table 8-2 describes the encodings for TT0-TT3. 

Table 8-2. Transfer Type Encodings

TT0  TT1  TT2  TT3  Operation                          Bus Transaction            Comment
0    0    0    0    Clean sector                       Address only               Due to cache control operation ¹
0    0    0    1    Write with flush                   Single-beat write          —
0    0    1    0    Flush sector                       Address only               Due to cache control operation ¹
0    0    1    1    Write with kill                    Burst                      Cache sector writes (replacement sector copy backs and snoop push operations)
0    1    0    0    sync                               Address only               Due to cache control operation ¹
0    1    0    1    Read                               Single-beat read or burst  —
0    1    1    0    Kill sector                        Address only               Store hit on shared sector or cache control operation ¹
0    1    1    1    Read with intent to modify         Burst                      Store cache miss
1    0    0    0    —                                  —                          Reserved
1    0    0    1    Write with flush atomic            Single-beat write          Caused by stwcx
1    0    1    0    External control out               Single-beat write          Caused by ecowx ²
1    0    1    1    —                                  —                          Reserved
1    1    0    0    TLB invalidate                     Address only               —
1    1    0    1    Read atomic                        Single-beat read or burst  Caused by lwarx instruction
1    1    1    0    External control in                Single-beat read           Caused by eciwx ²
1    1    1    1    Read with intent to modify atomic  Burst                      Caused by stwcx instruction

¹ Cache control operations resulting from explicit cache control instructions (for example, dcbf, sync, dcbz,
dcbi).

² The signal encodings for these operations do not use the TT0 and TT3 signals in the manner described in
Table 8-1.
Note that TT4 is reserved.
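
For readers who prefer a lookup form, the encodings of Table 8-2 can be sketched as a dictionary keyed by the 4-bit TT0-TT3 value. This is an illustrative sketch only: the names are hypothetical, TT0 is treated as the most significant bit, and the rows are assumed to run in binary order from 0000 to 1111 as in the table.

```python
# (operation, bus transaction) indexed by TT0-TT3, TT0 as the most significant bit.
TRANSFER_TYPES = {
    0b0000: ("Clean sector", "Address only"),
    0b0001: ("Write with flush", "Single-beat write"),
    0b0010: ("Flush sector", "Address only"),
    0b0011: ("Write with kill", "Burst"),
    0b0100: ("sync", "Address only"),
    0b0101: ("Read", "Single-beat read or burst"),
    0b0110: ("Kill sector", "Address only"),
    0b0111: ("Read with intent to modify", "Burst"),
    0b1000: ("Reserved", None),
    0b1001: ("Write with flush atomic", "Single-beat write"),
    0b1010: ("External control out", "Single-beat write"),
    0b1011: ("Reserved", None),
    0b1100: ("TLB invalidate", "Address only"),
    0b1101: ("Read atomic", "Single-beat read or burst"),
    0b1110: ("External control in", "Single-beat read"),
    0b1111: ("Read with intent to modify atomic", "Burst"),
}


def decode_transfer_type(tt0, tt1, tt2, tt3):
    """Look up the operation for the given TT0-TT3 bit values (0 or 1)."""
    code = (tt0 << 3) | (tt1 << 2) | (tt2 << 1) | tt3
    return TRANSFER_TYPES[code]
```

A snooping device model could call decode_transfer_type with the sampled bits and ignore address-only entries when deciding whether a data tenure follows.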

8.2.4.2 Transfer Size (TSIZ0-TSIZ2) 

The transfer size (TSIZ0-TSIZ2) signals consist of three input/output signals on the
MPC601.

8.2.4.2.1 Transfer Size (TSIZ0-TSIZ2)— Output 

Following are state meaning and timing comments for the TSIZ0-TSIZ2 output signals on
the MPC601.

State Meaning Asserted/Negated — For memory accesses, these signals, along with
TBST, indicate the data transfer size for the current bus operation, as
shown in Table 8-3. Table 9-2 shows how the TSIZ signals are used
with the address signals for aligned transfers. Table 9-3 shows how
the TSIZ signals are used with the address signals for misaligned
transfers. For the I/O transfer protocol, these signals form part of the
I/O transfer code (see the description of TT0-TT4).

For external control instructions (eciwx and ecowx), TSIZ0-TSIZ2
are used to output bits 29-31 of the EAR, which are used to form the
resource ID (TBST||TSIZ0-TSIZ2).

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

Table 8-3. Data Transfer Size

TBST      TSIZ0-TSIZ2  Transfer Size
Asserted  010          Burst
Negated   000          8 bytes
Negated   001          1 byte
Negated   010          2 bytes
Negated   011          3 bytes
Negated   100          4 bytes
Negated   101          5 bytes
Negated   110          6 bytes
Negated   111          7 bytes
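
The size encoding in Table 8-3 can be sketched as a small decoder. The function name is hypothetical, and raising an error for a TBST-asserted combination that the table does not list is a choice made for this sketch, not documented processor behavior.

```python
def transfer_size(tbst_asserted, tsiz):
    """Decode TBST and TSIZ0-TSIZ2 per Table 8-3.

    Returns the string "burst" for a burst transfer, otherwise the
    number of bytes in the single-beat transfer.
    """
    if not 0 <= tsiz <= 0b111:
        raise ValueError("TSIZ must be a 3-bit value")
    if tbst_asserted:
        if tsiz != 0b010:
            # Table 8-3 lists only 010 with TBST asserted.
            raise ValueError("combination not listed in Table 8-3")
        return "burst"
    # 000 encodes 8 bytes; every other value is the byte count itself.
    return 8 if tsiz == 0 else tsiz
```

Note that the encoding 000 stands for the full 8-byte data bus width rather than zero bytes.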



8.2.4.2.2 Transfer Size (TSIZ0-TSIZ2)— Input 

Following are state meaning and timing comments for the TSIZ0-TSIZ2 input signals on
the MPC601.

State Meaning Asserted/Negated — Represents the size of the current transfer, as 

shown in Table 8-3. For the I/O controller interface protocol, these 
signals form part of the I/O transfer code (see TT). 

Timing Comments Assertion/Negation — The same as A0-A31.



8.2.4.3 Transfer Burst (TBST) 

The transfer burst (TBST) signal is an input/output signal on the MPC601.

8.2.4.3.1 Transfer Burst (TBST)— Output 

Following are the state meaning and timing comments for the TBST output signal. 

State Meaning Asserted — Indicates that a burst transfer is in progress when
TSIZ0-TSIZ2 are set to 010.

Negated — Indicates that a burst transfer is not in progress. Also, part
of I/O transfer code (see TT).

For external control instructions (eciwx and ecowx), TBST is used
to output bit 28 of the EAR, which is used to form the resource ID
(TBST||TSIZ0-TSIZ2).

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

8.2.4.3.2 Transfer Burst (TBST)— Input

Following are state meaning and timing comments for the TBST input signal.

State Meaning Asserted/Negated — Indicates that a burst transfer is in progress
when asserted and TSIZ0-TSIZ2 are set to 010. For the I/O transfer
protocol, this signal forms part of the I/O transfer code (see the
description of TT).

Timing Comments Assertion/Negation — The same as A0-A31.
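
For the external control instructions, the TBST and TSIZ0-TSIZ2 output descriptions say that EAR bit 28 and EAR bits 29-31 together form the resource ID. A minimal sketch of that mapping follows; the function name is hypothetical, and it assumes the EAR is a 32-bit register numbered with bit 0 as the most significant bit, so bits 28-31 are the four low-order bits.

```python
def resource_id_signals(ear):
    """Split the low four EAR bits into the values carried by TBST and TSIZ.

    Assumption: EAR bits 28-31 (bit 0 = most significant bit) are the
    four least significant bits of the 32-bit register value.
    """
    resource_id = ear & 0xF
    return {
        "TBST": (resource_id >> 3) & 0x1,  # EAR bit 28
        "TSIZ": resource_id & 0x7,         # EAR bits 29-31
    }
```

With this numbering, an EAR value whose low nibble is 1101 places 1 on TBST and 101 on TSIZ0-TSIZ2.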

8.2.4.4 Transfer Code (TC0-TC1)— Output 

The transfer code (TC0-TC1) consists of two output signals on the MPC601. Following
are the state meaning and timing comments for the TC0-TC1 signals.

State Meaning Asserted/Negated — Represents a special encoding for the transfer in 

progress (see Table 8-4). 

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

Table 8-4. Encodings for TC0-TC1

Signal  Description

TC0     Depends on whether the current transaction is a read or write operation; therefore, TC0 should be
        used with TT1. On a read operation, TC0 asserted indicates the transaction is an instruction fetch
        operation; otherwise, the read operation is a data operation.

        Asserting TC0 for write operations indicates the write is a response to a snoop hit to modified data;
        TC0 negated indicates the write is not a snoop push (it is therefore a cache cast-out, write-through, or
        cache-inhibited write operation).

TC1     TC1, when asserted, indicates that an operation to reload the other sector is queued; therefore, the
        next bus transaction will likely be to the same page of memory. After the addressed sector in a cache
        line is loaded from memory, the MPC601 attempts to load the other sector in the cache line. This is a
        low-priority bus operation and may not be the next transaction. The assertion of TC1 suggests that the
        next access may be to the same page; the hint may be wrong depending on the bus traffic/code
        execution dynamics.
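
Because the meaning of TC0 depends on the transfer direction, external logic has to combine it with TT1. The following sketch shows one way to express that; the function name and the returned strings are hypothetical, and is_read stands for TT1 indicating a read.

```python
def describe_tc0(is_read, tc0_asserted):
    """Interpret TC0 together with the read/write indication from TT1."""
    if is_read:
        return "instruction fetch" if tc0_asserted else "data read"
    if tc0_asserted:
        return "snoop push"
    return "cast-out, write-through, or cache-inhibited write"
```

TC1 is only a hint about the next transaction, so a model of it would be advisory at most and is omitted here.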



8.2.4.5 Cache Inhibit (CI)— Output

The cache inhibit (CI) signal is an output signal on the MPC601. Following are the state
meaning and timing comments for the CI signal.

State Meaning Asserted — Indicates that a single-beat transfer will not change the 

cache, reflecting the setting of the I bit for the address of the current 
transaction. 

Negated — Indicates that a burst transfer will allocate a sector in the 
MPC601 data cache. 

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

8.2.4.6 Write-through (WT)— Output

The write-through (WT) signal is an output signal on the MPC601. Following are the state
meaning and timing comments for the WT signal. 

State Meaning Asserted — Indicates that a single-beat transaction is write-through,

reflecting the value of the W bit for the address of the current 
transaction. 

Negated — Indicates that a transaction is not write-through. 

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.



8.2.4.7 Global (GBL)

The global (GBL) signal is an input/output signal on the MPC601. 

8.2.4.7.1 Global (GBL)— Output

Following are the state meaning and timing comments for the GBL output signal. 

State Meaning Asserted — Indicates that a transaction is global, reflecting the setting 

of the M bit for the address of the current transaction (except in the 
case of copy-back operations, which are non-global).

Negated — Indicates that a transaction is not global. 

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

8.2.4.7.2 Global (GBL)— Input

Following are the state meaning and timing comments for the GBL input signal. 

State Meaning Asserted — Indicates that a transaction must be snooped by the 

MPC601. 

Negated — Indicates that a transaction is not snooped by the MPC601 
(even if TT0-TT4 indicate an invalidation transaction).

Timing Comments Assertion/Negation — The same as A0-A31.

8.2.4.8 Cache Set Element (CSE0-CSE2)— Output 

The cache set element (CSE0-CSE2) signals consist of three output signals on the 
MPC601. Following are state meaning and timing comments for the CSE signals.

State Meaning Asserted/Negated — Represents the cache replacement set element 

for the current transaction reloading into or writing out of the cache. 
Can be used with the address bus and the transfer attribute signals to 
externally track the state of each cache sector in the MPC601's
cache.
See Section 4.7.4, "MESI Hardware Considerations."

Timing Comments Assertion/Negation — The same as A0-A31.
High Impedance — The same as A0-A31.

8.2.4.9 High-Priority Snoop Request (HP_SNP_REQ) 

The High-priority Snoop Request (HP_SNP_REQ) signal is an input signal (input-only) on
the MPC601. Following are state meaning and timing comments for the HP_SNP_REQ
signal.

State Meaning Asserted — Indicates that the MPC601 may add an additional 

reserved queue to the list of available queues for push transactions 
that are a result of a snoop hit. 

Negated — Indicates that the MPC601 will not make available the 
reserved queue for a snoop hit push resulting from a transaction. This 
is the "normal" mode. 

Timing Comments Assertion/Negation — The same as A0-A31.

NOTE: This pin is a feature of the MPC601 only and will not be available in any other 
PowerPC processors. 

8.2.5 Address Transfer Termination Signals 

The address transfer termination signals are used to indicate either that the address phase 
of the transaction has completed successfully or must be repeated, and when it should be 
terminated. These signals are also used to maintain MESI protocol. For detailed 
information about how these signals interact, see Section 9.3.3, "Address Transfer 
Termination." 

8.2.5.1 Address Acknowledge (AACK)— Input 

The address acknowledge (AACK) signal is an input signal (input-only) on the MPC601. 
Following are state meaning and timing comments for the AACK signal. 

State Meaning Asserted — Indicates that the address phase of a transaction is 

complete. The address bus will go to a high impedance state on the 
next bus clock cycle. The MPC601 samples ARTRY on the bus clock 
cycle following the assertion of AACK. 

Negated — (During ABB) indicates that the address bus and the 
transfer attributes must remain driven. 

Timing Comments Assertion — May occur as early as the bus clock cycle after TS or 
XATS is asserted; assertion can be delayed to allow adequate address 
access time for slow devices. For example, if an implementation 
supports slow snooping devices, an external arbiter can postpone the 
assertion of AACK. 

Negation — Must occur one bus clock cycle after the assertion of
AACK.

8.2.5.2 Address Retry (ARTRY)

The address retry (ARTRY) signal is an input/output signal on the MPC601.

8.2.5.2.1 Address Retry (ARTRY)— Output 

Following are the state meaning and timing comments for the ARTRY output signal. 

State Meaning Asserted — Indicates that the MPC601 detects a condition in which a 

snooped address tenure must be retried (see SHD for encoding). If 
the MPC601 needs to update memory as a result of the snoop that 
caused the retry, the MPC601 asserts BR (unless it is parked).

High Impedance — Indicates that the MPC601 does not need the 
snooped address tenure to be retried. 

Timing Comments Assertion — Occurs two bus cycles immediately following the 
assertion of TS if a retry is required.



Negation — Occurs the bus cycle after the assertion of AACK. Since 
this signal may be simultaneously driven by multiple devices, it 
negates in a unique fashion. First the buffer goes to high impedance 
for one bus cycle, then it drives high for one 2XPCLK cycle before 
returning to high impedance. 

This special method of negation may be disabled using the mtspr 
instruction to write bit 29 of the HID0 register.

Table 8-5 shows the relationship between the SHD and ARTRY signals.

Table 8-5. SHD and ARTRY Signals

SHD  ARTRY  Description
Z    Z      No snoop hit, no busy pipeline
Z    A      Pipeline busy
A    Z      Snoop hit shared
A    A      Snoop hit modified
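
The four SHD/ARTRY combinations in Table 8-5 can be sketched as a lookup for use in a bus monitor or simulation. The names are hypothetical, and the sketch assumes that "A" in the table means asserted and "Z" means the output is left in high impedance (treated here as not asserted).

```python
# Keys are (shd_asserted, artry_asserted); "Z" in the table maps to False.
SNOOP_RESPONSES = {
    (False, False): "No snoop hit, no busy pipeline",
    (False, True): "Pipeline busy",
    (True, False): "Snoop hit shared",
    (True, True): "Snoop hit modified",
}


def snoop_response(shd_asserted, artry_asserted):
    """Return the Table 8-5 description for a sampled SHD/ARTRY pair."""
    return SNOOP_RESPONSES[(bool(shd_asserted), bool(artry_asserted))]
```

Only the combination with both signals asserted reports a hit on modified data; ARTRY alone indicates a busy pipeline.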



8.2.5.2.2 Address Retry (ARTRY)— Input 

Following are the state meaning and timing comments for the ARTRY input signal. 

State Meaning Asserted — If the MPC601 is the address bus master, ARTRY 

indicates that the MPC601 must retry the preceding address tenure 
and immediately negate BR (if asserted). If the MPC601 is not the
address bus master, this input indicates that the MPC601 should
immediately negate BR for one bus clock cycle following the
negation of ARTRY. Note that the subsequent address retried may
not be the same one associated with the assertion of the ARTRY
signal.

Negated/High Impedance — Indicates that the MPC601 does not
need to retry the last address tenure.

Timing Comments Assertion — Must occur by the bus clock cycle immediately 
following the assertion of AACK if a retry is required. 

Negation — Must occur during the second cycle after the assertion of 
AACK. Note that this signal is sampled only following the assertion 
of AACK. 



8.2.5.3 Shared (SHD) 

The shared (SHD) signal is an input/output signal on the MPC601. 

8.2.5.3.1 Shared (SHD)— Output

Following are the state meaning and timing comments for the SHD output signal. 

State Meaning Asserted — Indicates that the MPC601 either needs the data to be 

shared (in response to a snoop hit for a transaction not requiring
invalidation) or with ARTRY indicates the MPC601 has a hit on a
cache sector marked as modified. 

Negated/High Impedance — Indicates that the MPC601 did not have 
a cache hit on the snooped address. 

Timing Comments Assertion — The same as ARTRY.
Negation — The same as ARTRY.
High Impedance — The same as ARTRY.

See Table 8-6 for information on SHD and ARTRY signals.

Table 8-6. SHD and ARTRY Signals

SHD  ARTRY  Description
Z    Z      No snoop hit, no busy pipeline
Z    A      Pipeline busy
A    Z      Snoop hit shared
A    A      Snoop hit modified



8.2.5.3.2 Shared (SHD)— Input

Following are the state meaning and timing comments for the SHD input signal. 

State Meaning Asserted — Indicates that for a self-generated transaction, the

MPC601 must allocate the incoming sector as shared (unmodified). 
Or if ARTRY is asserted, the transaction must be retried while the
other master updates memory.

Negated — Indicates that the address for the current transaction is not
in any other cache.

Timing Comments Assertion — The same as ARTRY.
Negation — The same as ARTRY. 

8.2.6 Data Bus Arbitration Signals 

Like the address bus arbitration signals, data bus arbitration signals maintain an orderly 
process for determining data bus mastership. Note that there is no data bus arbitration signal 
equivalent to the address bus arbitration signal BR (bus request), because, except for 
address-only transactions, TS and XATS imply data bus requests. For a detailed description
on how these signals interact, see Section 9.4.1, "Data Bus Arbitration." 

One special signal, DBWO, allows the MPC601 to be configured dynamically to write data 
out of order with respect to read data. For detailed information about using DBWO, see 
Section 9.10, "Using DBWO (Data Bus Write Only)." 



8.2.6.1 Data Bus Grant (DBG)— Input

The data bus grant (DBG) signal is an input signal (input-only) on the MPC601. Following 
are the state meaning and timing comments for the DBG signal. 

State Meaning Asserted — Indicates that the MPC601 may, with the proper 

qualification, assume mastership of the data bus. The MPC601 
derives a qualified data bus grant when DBG is asserted and DBB, 
DRTRY, and ARTRY are negated; that is, the data bus is not busy
(DBB is negated) and there is no outstanding attempt to retry the 
associated address tenure (ARTRY is negated) or the current data 
tenure (DRTRY is negated). 

Negated — Indicates that the MPC601 must hold off its data tenures. 

Timing Comments Assertion — May occur any time to indicate the MPC601 is free to 
take data bus mastership. It is not sampled until TS is asserted.

Negation — May occur at any time to indicate the MPC601 cannot 
assume data bus mastership. 
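
The qualified data bus grant described in the state meaning above reduces to one Boolean expression. The sketch below uses a hypothetical function name, and each argument is True when the corresponding signal is asserted.

```python
def qualified_data_bus_grant(dbg, dbb, drtry, artry):
    """True when DBG is asserted and DBB, DRTRY, and ARTRY are all negated."""
    return dbg and not (dbb or drtry or artry)
```

Any one of a busy data bus, a pending data retry, or a pending address retry is enough to disqualify the grant.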

8.2.6.2 Data Bus Write Only (DBWO)— Input

The data bus write only (DBWO) signal is an input signal (input-only) on the MPC601.
Following are the state meaning and timing comments for the DBWO signal.

State Meaning Asserted — Indicates that the MPC601 may run the data bus tenure

for an outstanding write address even if a read address is pipelined 
before the write address. If DBWO is asserted, the MPC601 only
assumes data bus ownership for a pending data bus write operation
(that is, the MPC601 does not take the data bus for a pending read
operation if this input is asserted along with DBG). Care must be
taken with Wj to ensure the desired write is queued (such as a snoop 
hit push). Refer to Section 9.10, "Using DBWO (Data Bus Write
Only)," for detailed instructions for using DBWO.

Negated — Indicates that the MPC601 must run the data bus tenures 
in the same order as the address tenures. 

Timing Comments Assertion — Must occur no later than a qualified DBG for a previous 
write tenure. Do not assert if no data bus write tenures are
pending from previous address tenures.

Negation — May occur any time after a qualified DBG and before the 
next assertion of DBG. 
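
The effect of DBWO on data tenure ordering can be illustrated with a short sketch. It is a simplified model under stated assumptions: the function name and the "read"/"write" labels are hypothetical, and the list stands for pending address tenures in the order they were run.

```python
def next_data_tenure(pending, dbwo):
    """Pick the index of the pending tenure whose data tenure runs next.

    pending: list of "read" or "write" strings in address-tenure order.
    dbwo:    True when DBWO is asserted along with a qualified DBG.
    Returns None when no data tenure can run.
    """
    if not pending:
        return None
    if not dbwo:
        # DBWO negated: data tenures follow address-tenure order.
        return 0
    for index, kind in enumerate(pending):
        if kind == "write":
            # DBWO asserted: the first pending write may run out of order.
            return index
    # DBWO asserted with no write pending: a read is not taken.
    return None


print(next_data_tenure(["read", "write"], dbwo=False))
print(next_data_tenure(["read", "write"], dbwo=True))
```

The final case (DBWO asserted with no write pending) is the situation the timing comments say to avoid.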

8.2.6.3 Data Bus Busy (DBB) 

The data bus busy (DBB) signal is both an input and an output signal on the MPC601.

8.2.6.3.1 Data Bus Busy (DBB)— Output 

Following are the state meaning and timing comments for the DBB output signal. 

State Meaning Asserted — Indicates that the MPC601 is the data bus master. The 

MPC601 always assumes data bus mastership if it needs the data bus 
and is given a qualified data bus grant (see DBG). 

Negated — Indicates that the MPC601 is not using the data bus. 

Timing Comments Assertion — Occurs during the bus clock cycle following a qualified 
DBG. 

Negation — Occurs during the bus clock cycle following the 
assertion of the final TA.

High Impedance — Occurs one-half processor clock cycle after DBB 
is negated. 

8.2.6.3.2 Data Bus Busy (DBB)— Input

Following are the state meaning and timing comments for the DBB input signal. 

State Meaning Asserted — Indicates that another device is bus master. 

Negated — Indicates that the data bus is free (with proper 
qualification, see DBG) for use by the MPC601 . 

Timing Comments Assertion — Must occur when the MPC601 must be prevented from
using the data bus. 

Negation — May occur whenever the data bus is available. 

8.2.7 Data Transfer Signals 

Like the address transfer signals, the data transfer signals are used to transmit data and to 
generate and monitor parity for the data transfer. For a detailed description of how the data 
transfer signals interact, see Section 9.4.2, "Data Transfer." 

8.2.7.1 Data Bus (DH0-DH31, DL0-DL31) 

The data bus (DH0-DH31 and DL0-DL31) consists of 64 signals that are both input and
output on the MPC601. Following are the state meaning and timing comments for the DH 
and DL signals. 

State Meaning The data bus has two halves — data bus high (DH) and low (DL). See 

Table 8-7 for the data bus lane assignments. Data byte lanes are
illustrated in Figure 9-10. I/O controller interface operations use DH 
exclusively (that is, there are no 64-bit, I/O transfers). 

Timing Comments The data bus is driven once for non-cached transactions and four 
times for cache transactions (bursts). 

Table 8-7. Data Bus Lane Assignments

Data Bus Signals    Byte Lane
DH0-DH7             0
DH8-DH15            1
DH16-DH23           2
DH24-DH31           3
DL0-DL7             4
DL8-DL15            5
DL16-DL23           6
DL24-DL31           7
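
The lane assignments follow a regular pattern that can be computed rather than looked up. The sketch below is illustrative only; the function name is hypothetical, and lane 0 is assumed to correspond to DH0-DH7 as the numbering of the other lanes implies.

```python
def byte_lane_signals(lane):
    """Return the data bus signal range that carries the given byte lane."""
    if not 0 <= lane <= 7:
        raise ValueError("byte lane must be in the range 0-7")
    half = "DH" if lane < 4 else "DL"   # lanes 0-3 use DH, lanes 4-7 use DL
    first = (lane % 4) * 8              # first signal index within the half
    return f"{half}{first}-{half}{first + 7}"


for lane in range(8):
    print(lane, byte_lane_signals(lane))
```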



8.2.7.1.1 Data Bus (DH0-DH31, DL0-DL31)— Output

Following are the state meaning and timing comments for the DH and DL output signals. 

State Meaning Asserted/Negated — Represents the state of data during a data write.
Unused byte lanes are driven to deterministic values.

Timing Comments Assertion/Negation — Initial beat coincides with DBB and, for
bursts, transitions on the bus clock cycle following each assertion of
TA.

High Impedance — Occurs on the bus clock cycle after the final
assertion of TA.

8.2.7.1.2 Data Bus (DH0-DH31, DL0-DL31)— Input 

Following are the state meaning and timing comments for the DH and DL input signals. 

State Meaning Asserted/Negated — Represents the state of data during a data read 

transaction. 

Timing Comments Assertion/Negation — Must occur on the same bus clock cycle that 
TA is asserted; however, if DRTRY is asserted, it must coincide with
the assertion of the final DRTRY for a given data beat. 

8.2.7.2 Data Bus Parity (DP0-DP7) 

The eight data bus parity (DP0-DP7) signals on the MPC601 are both output and input
signals. 

8.2.7.2.1 Data Bus Parity (DP0-DP7)— Output 

Following are the state meaning and timing comments for the DP output signals. 

State Meaning Asserted/Negated — Represents odd parity for each of eight bytes of 

data write transactions. The signal assignments are listed in 
Table 8-8. 

Timing Comments Assertion/Negation — The same as DL0-DL31.
High Impedance — The same as DL0-DL31.

Table 8-8. DP0-DP7 Signal Assignments

Signal Name    Signal Assignments
DP0            DH0-DH7
DP1            DH8-DH15
DP2            DH16-DH23
DP3            DH24-DH31
DP4            DL0-DL7
DP5            DL8-DL15
DP6            DL16-DL23
DP7            DL24-DL31






8.2.7.2.2 Data Bus Parity (DP0-DP7)— Input

Following are the state meaning and timing comments for the DP input signals. 

State Meaning Asserted/Negated — Represents odd parity for each byte of read 

data. Parity is checked on all data byte lanes, regardless of the size 
of the transfer. Detected even parity causes a machine-check 
exception if data parity errors are enabled in the ME bit of the MSR. 
(See DPE.)

Timing Comments Assertion/Negation — The same as DL0-DL31.
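
Odd parity per byte lane can be sketched as follows. This is an illustration rather than a description of the parity logic itself: the function names are hypothetical, a parity value of 1 is assumed to mean a logic 1 on the DP signal, and DH0 and DL0 are assumed to be the most significant bits of their halves (so that byte lane 0 is the most significant byte).

```python
def odd_parity_bit(byte):
    """Return the bit that makes the number of 1s in the 9 bits odd."""
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 == 1 else 1


def data_parity(dh, dl):
    """Return [DP0, ..., DP7] for 32-bit DH and DL values."""
    word = ((dh & 0xFFFFFFFF) << 32) | (dl & 0xFFFFFFFF)
    # Byte lane 0 is the most significant byte of the 64-bit value.
    return [odd_parity_bit((word >> (56 - 8 * lane)) & 0xFF)
            for lane in range(8)]


def has_parity_error(dh, dl, dp):
    """Return True when any byte lane shows even parity."""
    return data_parity(dh, dl) != list(dp)


print(data_parity(0x01000000, 0x00000000))
```

Because parity is checked on all byte lanes regardless of transfer size, the comparison in has_parity_error covers all eight lanes.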



8.2.7.3 Data Parity Error (DPE)— Output 

The data parity error (DPE) signal is an output signal (output-only) on the MPC601. 
Following are the state meaning and timing comments for the DPE signal. 

State Meaning Asserted — Indicates incorrect data bus parity. 

Negated — Indicates correct data bus parity.

Timing Comments Assertion — Occurs on the second bus clock cycle after TA is
asserted to the MPC601.

Negation — Occurs on the third bus clock cycle after TA is asserted
to the MPC601.

8.2.8 Data Transfer Termination Signals 

Data termination signals are required after each data beat in a data transfer. Note that in a 
single-beat transaction, the data termination signals also indicate the end of the tenure, 
while in burst accesses, the data termination signals apply to individual beats and indicate 
the end of the tenure only after the final data beat. 

These signals are also used to maintain MESI protocol. For a detailed description of how 
these signals interact, see Section 9.4.3, "Data Transfer Termination." 

8.2.8.1 Transfer Acknowledge (TA) — Input 

The transfer acknowledge (TA) signal is an input signal (input-only) on the MPC601.
Following are state meaning and timing comments for the TA signal. 

State Meaning Asserted — Indicates that a single-beat data transfer completed 

successfully or that a data beat in a burst transfer completed 
successfully (unless DRTRY is asserted on the next bus clock cycle). 
Note that TA must be asserted for each data beat in a burst
transaction. For more information refer to Section 9.4.3, "Data 
Transfer Termination." 

Negated — (During DBB) indicates that, until TA is asserted, the
MPC601 must continue to drive the data for the current write or must
wait to sample the data for reads.

Timing Comments Assertion — Must not occur before AACK for the current transaction
(if the address retry mechanism is to be used); otherwise, assertion
may occur at any time during the assertion of DBB. The system can
withhold assertion of TA to indicate that the MPC601 should insert
wait states to extend the duration of the data beat.

Negation — Must occur after the bus clock cycle of the final (or only) 
data beat of the transfer. For a burst transfer, the system can assert TA
for one bus clock cycle and then negate it to advance the burst 
transfer to the next beat and insert wait states during the next beat. 

8.2.8.2 Data Retry (DRTRY)— Input 

The data retry (DRTRY) signal is input only on the MPC601 . Following are state meaning 
and timing comments for the DRTRY signal. 

State Meaning Asserted — Indicates that the MPC601 must invalidate the data from 

the previous read operation. 

Negated — Indicates that data presented with TA on the previous read
operation is valid. This is essentially a late TA to allow speculative
forwarding of data (with TA) during reads. Note that DRTRY is
ignored for write transactions.

Timing Comments Assertion — Must occur during the bus clock cycle immediately after 
TA is asserted if a retry is required. The DRTRY signal may be held
asserted for multiple bus clock cycles. 

Negation — Must occur during the bus clock cycle after a valid data 
beat. This may occur several cycles after DBB is negated, effectively 
extending the data bus tenure. 



8.2.8.3 Transfer Error Acknowledge (TEA)— Input 

The transfer error acknowledge (TEA) signal is input only on the MPC601 . Following are 
state meaning and timing comments for the TEA signal. 

State Meaning Asserted — Indicates that a bus error occurred. Causes a machine

check exception (and possibly causes the processor to enter
checkstop state if the machine check enable bit is cleared, MSR[ME] =
0). For more information see Section 5.4.2.2, "Checkstop State
(MSR[ME] = 0)." Assertion terminates the current transaction; that
is, assertion of TA and DRTRY are ignored. The assertion of TEA
causes the negation/high impedance of DBB in the next clock cycle.
However, data entering the GPR or the cache are not invalidated.

Negated — Indicates that no bus error was detected. 

Timing Comments Assertion — May be asserted while DBB and/or DRTRY is asserted.

Negation — TEA must be asserted for at least one bus clock cycle. 
TEA must be negated no later than the negation of DBB or the last 
DRTRY. 

8.2.9 System Status Signals 

Most system status signals are input signals that indicate when exceptions are received, 
when checkstop conditions have occurred, and when the MPC601 must be reset. The
MPC601 generates the output signal, CKSTP_OUT, when it detects a checkstop condition.
For a detailed description of these signals, see Section 9.7, "Interrupt, Checkstop, and Reset 
Signals." 

8.2.9.1 Interrupt (INT)— Input

The interrupt (INT) signal is input only. Following are state meaning and timing comments
for the INT signal.

State Meaning Asserted — Indicates that if the MSR[EE] (bit 16, the external 

interrupt enable bit) is set, the MPC601 begins processing an
external interrupt exception. 

Negated — Indicates that normal operation should proceed. See 
Section 9.7.1, "External Interrupt." 

Timing Comments Assertion — May occur at any time. 

Negation — May occur any time after the minimum pulse width has 
been met. (Minimum pulse width is 3 processor clock cycles.) After 
the minimum pulse width has been met, an interrupt exception 
occurs.
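
The dependence of interrupt processing on MSR[EE] can be shown with a short sketch. The helper names are hypothetical; the sketch assumes a 32-bit MSR with PowerPC bit numbering, in which bit 0 is the most significant bit, so bit 16 corresponds to the mask 0x00008000.

```python
MSR_EE_BIT = 16  # external interrupt enable bit, as stated in the text


def msr_bit(msr, n):
    """Return bit n of a 32-bit MSR value (bit 0 is the most significant)."""
    return (msr >> (31 - n)) & 1


def takes_external_interrupt(int_asserted, msr):
    """Return True when an asserted INT leads to exception processing."""
    return bool(int_asserted and msr_bit(msr, MSR_EE_BIT))


print(takes_external_interrupt(True, 0x00008000))   # EE set
print(takes_external_interrupt(True, 0x00000000))   # EE cleared
```

The sketch ignores the minimum pulse width requirement, which must also be met before the exception occurs.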

8.2.9.2 Checkstop Input (CKSTP_IN)— Input

The checkstop input (CKSTP_IN) signal is input only on the MPC601. Following are state
meaning and timing comments for the CKSTP_IN signal.

State Meaning Asserted — Indicates that the MPC601 must terminate operation by

internally gating off all clocks. Once CKSTP_IN has been asserted
it must remain asserted until the system has been reset; otherwise the
clocks resume operation.

Negated — Indicates that normal operation should proceed. See 
Section 9.7.2, "Checkstops." 

Timing Comments Assertion — May occur at any time and may be asserted

asynchronously to the input clocks. CKSTP_IN must be asserted for
a minimum of three PCLK_EN clock cycles. Alternatively, it may be asserted
synchronously, meeting setup and hold times (specified in the
electrical specifications), in which case it must be asserted for at least two
PCLK_EN clock cycles.

Negation — May occur any time after the CKSTP_OUT output signal
has been asserted.



8.2.9.3 Checkstop Output (CKSTP_OUT)— Output 

The checkstop output (CKSTP_OUT) signal is output only on the MPC601. Following are
state meaning and timing comments for the CKSTP_OUT signal.

State Meaning Asserted — Indicates that the MPC601 has detected a checkstop 

condition and has ceased operation. 

Negated — Indicates that the MPC601 is operating normally.
See Section 9.7.2, "Checkstops." 

Timing Comments Assertion — May occur at any time and may be asserted 
asynchronously to the MPC601 input clocks. 

Negation — Requires HRESET assertion. 

8.2.9.4 Reset Signals 

There are two reset signals on the MPC601 — hard reset (HRESET) and soft reset
(SRESET). Descriptions of each follow.

8.2.9.4.1 Hard Reset (HRESET)— Input 

The hard reset (HRESET) signal is input only and must be used at power-on to properly 
reset the processor. Following are state meaning and timing comments for the HRESET 
signal. 

State Meaning Asserted — Initiates a complete hard reset operation when this input 

transitions from asserted to negated. Causes a reset exception as 
described in Section 5.4.1.2, "Hard Reset." Output drivers are 
released to high impedance within three clocks after the assertion of 
HRESET. 

Negated — Indicates that normal operation should proceed. See 
Section 9.7.3, "Reset Inputs." 

Timing Comments Assertion — May occur at any time and may be asserted
asynchronously to the MPC601 input clocks.

Negation — May occur any time after the minimum reset pulse width
has been met. (Minimum pulse width is 300 processor clock cycles.)

This input has additional functionality in certain test modes. 

8.2.9.4.2 Soft Reset (SRESET)— Input 

The soft reset (SRESET) signal is input only. Following are state meaning and timing 
comments for the SRESET signal. 

State Meaning Asserted — Initiates processing for a reset exception as described

in Section 5.4.1.1, "Soft Reset." 

Negated — Indicates that normal operation should proceed. See 
Section 9.7.3, "Reset Inputs." 

Timing Comments Assertion — May occur at any time. 

Negation — May occur any time after the minimum soft-reset pulse 
width has been met. (Minimum pulse width is 10 processor clock 
cycles.) 

This input has additional functionality in certain test modes. 



8.2.9.5 System Quiesced (SYS_QUIESC) 

The system quiesced (SYS_QUIESC) signal is input only. Following are state meaning and 
timing comments for the SYS_QUIESC signal.

State Meaning Asserted — Enables soft stop in the MPC601.

Negated — Indicates that soft stop is not enabled in the
MPC601 processor.

Timing Comments Assertion/Negation — Must meet setup and hold times as described 
in the electrical specifications. 

8.2.9.6 Resume (RESUME) 

The resume (RESUME) signal is input only. Following are state meaning and timing 
comments for the RESUME signal. 

State Meaning Asserted — Restarts the MPC601 after a soft stop. 

Negated — Indicates that the MPC601 is not allowed to resume 
normal operation if a soft stop has occurred. 

Timing Comments Assertion — May occur at any time and may be asserted

asynchronously to the MPC601 input clocks. RESUME must be
asserted for a minimum of three PCLK_EN clock cycles. Alternatively, it may
be asserted synchronously, meeting setup and hold times (specified in
the electrical specifications), in which case it must be asserted for at least two
PCLK_EN clock cycles. For information, see the MPC601 electrical
specifications.

Negation — May occur any time after the minimum pulse width has 
been met. 

8.2.9.7 Quiesce Request (QUIESC_REQ) 

The quiesce request (QUIESC_REQ) signal is output only. Following are state meaning 
and timing comments for the QUIESC_REQ signal. 

State Meaning Asserted — Indicates that the MPC601 is requesting a soft stop for 

the system. 

Negated — Indicates that the MPC601 is operating normally. 

Timing Comments Assertion — May occur at any time to indicate that the MPC601 is
requesting a soft stop.

Negation — May occur at any time to indicate that the MPC601 is not
requesting a soft stop.

8.2.9.8 Reservation (RSRV)— Output 

The reservation (RSRV) signal is output only on the MPC601. Following are state meaning 
and timing comments for the RSRV signal. 

State Meaning Asserted/Negated — Represents the state of the reservation 

coherency bit in the reservation address register that is used by the 
lwarx and stwcx. instructions. See Section 9.8.1, "Support for the
lwarx/stwcx. Instruction Pair."

Timing Comments Assertion/Negation — Occurs synchronously with respect to bus

clock cycles. The execution of an lwarx instruction sets the internal
reservation condition; when the next bus transition occurs, RSRV is
asserted.

8.2.9.9 Driver Mode (SC_DRIVE)

The driver mode (SC_DRIVE) signal is input only on the MPC601. Following are state 
meaning and timing comments for the SC_DRIVE signal. 

State Meaning Asserted — Indicates that the drive current for the following output 

buffers is increased: ABB, DBB, ARTRY, SHD, TS, and XATS
(approximately 2x).

Negated — The drive current for the six signals above will be the
same as all other signals for the MPC601.

Timing Comments Assertion/Negation — This is not a dynamic signal; it must not
transition after HRESET is negated.

8.2.10 ESP/Scan Interface 

The MPC601 has extensive on-chip test capability including the following: 

• Built-in Self test (BIST) 

• Debug control/observation (ESP) 

• Boundary scan 

The built-in self test hardware is exercised as part of the POR sequence. The ESP and 
boundary scan logic are not used under typical operating conditions. 

Detailed discussion of the MPC601 test functions is beyond the scope of this document; 
however, sufficient information has been provided to allow the system designer to disable 
the test functions that would impede normal operation. 

The test interface is provided for testing. Table 8-9 describes the test interface signals. For 
more information, refer to Section 9.9, "IEEE 1149.1-Compatible Interface." The interface
is shown in Figure 8-2. 



[Figure 8-2 shows the following connections: TDI (module lead #186), TMS (module lead #184), TCK (module lead #187), BSCAN_EN (module lead #186), TDO (module lead #078), and TRST (module lead #279).]

Figure 8-2. IEEE 1149.1-Compatible Boundary Scan Interface

Table 8-9. ESP/Scan Interface

Signal Name    I/O    Timing Comments
SCAN_CTL       I      This input signal should be driven high to disable test modes.
SCAN_CLK       I      This input signal should be driven high to disable test modes.
SCAN_SIN       I      This input signal should be driven low to disable test modes.
ESP_EN         I      This input signal should be driven high to disable test modes.
BSCAN_EN       I      This input signal should be driven high to disable test modes.
RUN_NSTOP      O      This output signal is a no connect (NC) for non-test board designs.
SCAN_OUT       O      This output signal is a no connect (NC) for non-test board designs.


8.2.11 Test Signals 

Table 8-10 describes the MPC601's test and COP signals. The value in the operational level
column should be used for normal operations. 



Table 8-10. Test Interface

Signal Name    Operation Level    I/O
TST0           Low                I
TST1           Low                I
TST2-3         —
TST4           —
TST5           Low
TST6           Low
TST7           High
TST8           High
TST9           Low
TST10          High
TST11          High
TST12          High
TST13          High
TST14          High
TST15          High
TST16          High
TST17          High
TST18          Low
TST19          —
TST20          Low                I
TST21          Low                I
TST22          High               I
TST23          High               I



8.2.12 Clock Signals 

The clock signal inputs of the MPC601 determine the system clock frequency and provide 
a flexible clocking scheme that allows the processor to operate at an integer multiple of the 
system clock frequency. 

Refer to the MPC601 electrical specifications for exact timing relationships of the clock 
signals. 

8.2.12.1 Double-Speed Processor Clock (2X_PCLK)— Input 

The double-speed processor clock (2X_PCLK) signal is input only on the MPC601. This 
signal is the highest frequency input to the MPC601; it switches at twice the frequency of 
the internal P_CLOCK provided that the PCLK_EN signal is half the frequency of the
2X_PCLK as shown in Figure 8-3. This input clocks the latch that samples the PCLK_EN
input, providing duty-cycle control for the internal P_CLOCK (see Figure 8-3). 

Following are state meaning and timing comments for the 2X_PCLK signal. 

State Meaning Rising Edge — Is the clocking edge for a synchronizing latch used to 

generate the internal processor clock (see PCLK_EN). See 
Section 8.2.12, "Clock Signals." 

Timing Comments Duty cycle — Refer to the MPC601 electrical specifications.



8.2.12.2 Clock Phase (PCLK_EN)— Input 

The clock phase (PCLK_EN) signal is input only on the MPC601. The PCLK_EN signal
switches at the same frequency as the internal CPU clock (P_CLOCK in Figure 8-3). The
PCLK_EN signal determines the phase of the internal P_CLOCK (timing and duty cycle
are derived from the 2X_PCLK input); therefore, this input can be used to synchronize
multiple MPC601s.

Figure 8-3 shows how the internal P_CLOCK is always identical to the PCLK_EN signal
except it is inverted and delayed by one full 2X_PCLK cycle.

The MPC601 can tolerate dynamic P_CLOCK cycle stretching. This can be accomplished
by altering the duty cycle of the PCLK_EN input. For example, the system can extend a
given CPU clock cycle by negating PCLK_EN for more than one 2X_PCLK cycle. This
effectively delays the bus clock input sampling points and output drive points in
half-processor-cycle increments and further delays execution of instructions accordingly.
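
The relationship between PCLK_EN and the internal P_CLOCK, including cycle stretching, can be sketched in discrete time. This is a simplified model and not a timing-accurate description: the function name is hypothetical, levels are electrical (0 or 1), one list entry is assumed per 2X_PCLK rising edge, and the P_CLOCK level before the first sample is an assumed starting value.

```python
def p_clock(pclk_en_levels, initial=0):
    """Return the P_CLOCK level after each 2X_PCLK rising edge.

    pclk_en_levels: PCLK_EN level (0 or 1) sampled at successive edges.
    P_CLOCK is modeled as PCLK_EN inverted and delayed by one
    2X_PCLK cycle.
    """
    levels = []
    previous = None
    for level in pclk_en_levels:
        levels.append(initial if previous is None else 1 - previous)
        previous = level
    return levels


# Regular toggling of PCLK_EN gives a regular P_CLOCK.
print(p_clock([0, 1, 0, 1]))
# Holding PCLK_EN at one level for two 2X_PCLK cycles stretches
# the corresponding P_CLOCK phase by one 2X_PCLK cycle.
print(p_clock([0, 1, 1, 0, 1]))
```

In the second call the repeated PCLK_EN level produces a repeated P_CLOCK level one cycle later, which is the stretching effect described above.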



[Timing diagram showing PCLK_EN, 2X_PCLK, and the internal P_CLOCK.]

Figure 8-3. Internal P_CLOCK Generation

Following are state meaning and timing comments for the PCLK_EN signal. 

State Meaning Asserted — Indicates that the MPC601 should generate the high 

phase of the internal processor clock synchronized to 2X_PCLK. 
See Section 8.2.12, "Clock Signals." 

Negated — Indicates that the MPC601 should generate the low phase of
the internal processor clock synchronized to 2X_PCLK. 

Timing Comments Assertion — May occur one 2X_PCLK cycle after the negation of
PCLK_EN with appropriate setup to the falling edge of 2X_PCLK.

Negation — Must occur one 2X_PCLK cycle after the assertion of
PCLK_EN with appropriate setup to the falling edge of 2X_PCLK.



8.2.12.3 Bus Phase (BCLK_EN)— Input 

The bus phase (BCLK_EN) signal is input only on the MPC601. This input determines, in
conjunction with PCLK_EN and 2X_PCLK, the transition timing for the MPC601 bus 
interface. While all timing is derived from the rising edge of the 2X_PCLK input, the two 
phase inputs qualify the edge on which the processor and bus interface sequential logic can 
proceed. Inputs are sampled and outputs are driven with the qualified rising edge of the 
2X_PCLK input (see Figure 8-4). 



Following are state meaning and timing comments for the BCLK_EN signal. 

State Meaning Asserted — Indicates that the MPC601 must use the next rising edge 

of the internal processor clock to sample and drive the bus interface. 

Negated — Indicates that MPC601 outputs must not change state and
inputs will not be sampled. This signal can be treated as a
synchronous enable for the bus clock. See
Section 8.2.12, "Clock Signals."

Timing Comments Assertion/Negation — With appropriate setup and hold time to the
2X_PCLK, provided the rising edge of the internal processor clock
coincides with the 2X_PCLK.

Figure 8-4 through Figure 8-7 illustrate how the MPC601 clocking signals can be used to
generate a logical bus clock. Note that the resulting logical bus clock is represented as an 
arrow coincident with the rising edge of the resulting signal. It should not be inferred that 
the duty cycle of the bus clock signal is 50 percent. 

Figure 8-4 shows how the clock inputs can be used to control the MPC601. Note that the 
signal IN is the output of the inverter shown in Figure 8-3. 

Figure 8-4. Generation of Internal Clock (INTCLK)

Figure 8-5 shows a simple MPC601 clock implementation with the frequency of the logical
bus clock equal to that of the P_CLK.

Figure 8-5. Generation of Bus Transitions— Logical Bus Clock = P_CLK

Figure 8-6 shows the generation of the logical bus clock at one-half the frequency of the
P_CLK.

Figure 8-6. Generation of Bus Transitions — Logical Bus Clock = 1/2 P_CLK

Figure 8-7 shows how the PCLK_EN signal can be manipulated to perform cycle stretching
on the MPC601.

Figure 8-7. Generation of Bus Transitions— Cycle Stretching

In this document, processor clock refers to the internal P_CLOCK signal; bus clock refers 
to the clock that causes the bus transitions. 

Figure 8-5 and Figure 8-6 show two examples of the generation of bus transitions. In the 
first example, BCLK_EN is grounded (always asserted) and the bus clock period is 
equivalent to the P_CLOCK cycle period. In the second example, the BCLK_EN input is 
driven by a clock switching at PCLK_EN/2 frequency. This allows the MPC601 bus
interface to run at half the frequency of the CPU P_CLOCK, easing system design
constraints. Note that the BCLK_EN input can be divided further (with respect to
PCLK_EN), allowing an even greater ratio between the clock- and bus-cycle frequencies.

To operate the bus interface slower than P_CLOCK/2, BCLK_EN must be asserted only
for the intended P_CLOCK window (for example, the duty cycle can be skewed such that
the bus logic increments only once during each assertion of BCLK_EN).
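
The way BCLK_EN qualifies processor clock edges can be illustrated with a minimal sketch. The function names are hypothetical, and the model assumes one boolean per P_CLOCK rising edge, set to True when BCLK_EN is asserted for that edge.

```python
def bus_clock_edges(bclk_en_asserted):
    """Return the P_CLOCK edge numbers on which the bus interface advances."""
    return [edge for edge, asserted in enumerate(bclk_en_asserted) if asserted]


def processor_to_bus_ratio(bclk_en_asserted):
    """Return P_CLOCK edges per bus transition for a repeating pattern."""
    transitions = sum(1 for asserted in bclk_en_asserted if asserted)
    if transitions == 0:
        raise ValueError("BCLK_EN is never asserted in this pattern")
    return len(bclk_en_asserted) / transitions


# BCLK_EN always asserted: bus clock equals P_CLOCK.
print(bus_clock_edges([True, True, True, True]))
# BCLK_EN asserted on alternate edges: bus runs at half the P_CLOCK rate.
print(bus_clock_edges([True, False, True, False]))
# BCLK_EN asserted for one P_CLOCK window in three: one-third rate.
print(processor_to_bus_ratio([True, False, False]))
```

The last pattern corresponds to the case described above, in which BCLK_EN is asserted only for the intended P_CLOCK window.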

8.2.12.4 Real-Time Clock (RTC)— Input 

The real-time clock (RTC) signal is input only on the MPC601. Following are state 
meaning and timing comments for the RTC signal. 

State Meaning Rising Edge — Increments the 7.8125-MHz real-time clock in the 

MPC601.

Timing Comments Duty cycle — See the MPC601 electrical specifications. 
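
The 7.8125-MHz rate corresponds to a period of exactly 128 ns, which the following sketch confirms using exact rational arithmetic. The names are hypothetical and the example only converts a count of rising edges into elapsed time; it does not describe how the real-time clock registers are updated.

```python
from fractions import Fraction

RTC_HZ = Fraction(7_812_500)  # 7.8125 MHz, as stated in the text


def rtc_period_ns():
    """Return the RTC input period in nanoseconds."""
    return Fraction(10**9) / RTC_HZ


def rtc_edges_to_ns(edges):
    """Return the elapsed time in nanoseconds for a count of rising edges."""
    return edges * rtc_period_ns()


print(rtc_period_ns())
print(rtc_edges_to_ns(7_812_500))
```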

8.3 Clocking in a Multiprocessor System 

Clocking in a multiprocessor system adds a level of complexity. The MPC601 defines the
AC timing specifications for the chip inputs and outputs to allow for a reasonable amount 
of system-level skew and still allow the chip to meet its timing goals. These timing 
specifications can be found in the MPC601 electrical specifications. 

Chapter 9

System Interface Operation 

This section describes the MPC601 bus interface and its operation. It shows how the 
MPC601 signals, defined in Chapter 8, "Signal Descriptions," interact to perform address 
and data transfers. 

9.1 MPC601 System Interface Overview 

The system interface performs external accesses for loading and storing data and fetching 
instructions. 

Instructions are automatically fetched from the memory system into the instruction unit 
where they are dispatched to the execution units at a maximum rate of three instructions per 
clock. Conversely, load and store instructions explicitly specify the movement of operands 
to and from the integer and floating-point register files and the memory system. 

When the MPC601 encounters an instruction or data access, it calculates the logical address 
(effective address) and uses the low-order address bits to check for a hit in the on-chip, 32- 
Kbyte cache. Operation of the cache is described in Section 9.1.1, "Operation of the On- 
Chip Cache." During the cache lookup, the memory management unit (MMU) uses the 
upper-order address bits to calculate the virtual address, from which it calculates the 
physical address. The physical address bits are then compared with the corresponding 
cache tag bits to determine if a cache hit occurred. If the access misses in the cache, the 
physical address is used to access system memory. 

In addition to the loads, stores, and instruction fetches, the MPC601 performs other read 
and write operations for table searches, cache cast-out operations when least-recently used 
sectors are written to memory after a cache miss, and cache-sector snoop push-out 
operations when a modified sector experiences a snoop hit from another bus master. 

All read and write operations are handled by the memory unit, which consists of a two- 
element read queue that holds addresses for read operations, and a three-element write 
queue that contains addresses and data for write operations. To maintain coherency, the 
write queues are included in snooping. The interface allows one level of pipelining, that is, 
there can be two outstanding reads and writes at any given time. Note that these must be 
unlike operations; for example, there cannot be two outstanding explicit load operations, 
but there can be a load and an instruction fetch. Accesses are prioritized. The operation of 
the memory unit is described in Section 9.1.2, "Operation of the Memory Unit for Loads
and Stores." 

Figure 9-1 shows the address path from the execution units and instruction fetcher, through 
the translation logic to the cache and system interface logic. 

The MPC601 uses separate address and data buses and a variety of control and status 
signals for performing reads and writes. The address bus is 32 bits wide and the data bus is 
64 bits wide. The interface is synchronous — all timing is derived from the equivalent of the 
rising edge of the bus clock cycle. All MPC601 inputs are sampled at and all outputs are 
driven from this edge. The bus can run at the full processor-clock frequency or at an integer 
division of the processor-clock speed. The MPC601 provides a TTL-compatible interface.

9.1.1 Operation of the On-Chip Cache 

The MPC601's cache is a combined instruction and data (or unified) cache. It is a
physically-addressed, virtually-indexed, 32-Kbyte cache with eight-way set associativity. 
The cache consists of eight sets of 128 sectors. Each 16-word cache line consists of two 
8-word sectors. Both sectors share the same line address tag. Cache coherency, however, is 
maintained for each sector, so there are separate coherency state bits for each sector. If one 
sector of the line is filled from memory, the MPC601 attempts to load the other sector as a 
low-priority bus operation. There is no guarantee that the other sector will be loaded. 
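
The cache dimensions given above are mutually consistent, as the following arithmetic sketch shows. The constant names are hypothetical, a word is assumed to be 4 bytes, and each of the eight "sets" named in the text is treated as one of the eight associativity classes.

```python
WORD_BYTES = 4          # assumed 32-bit word
SECTOR_WORDS = 8        # each sector holds 8 words
SECTORS_PER_LINE = 2    # each line holds two sectors
ASSOCIATIVITY = 8       # eight-way set associativity
SECTORS_PER_CLASS = 128 # "eight sets of 128 sectors"

sector_bytes = SECTOR_WORDS * WORD_BYTES
line_bytes = SECTORS_PER_LINE * sector_bytes
total_bytes = ASSOCIATIVITY * SECTORS_PER_CLASS * sector_bytes
lines_per_class = SECTORS_PER_CLASS // SECTORS_PER_LINE

print(sector_bytes)     # bytes per sector
print(line_bytes)       # bytes per 16-word line
print(total_bytes)      # total cache size in bytes
print(lines_per_class)  # lines selected by the index in each class
```

The total works out to 32 Kbytes, matching the stated cache size.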

Because the cache on the MPC601 is an on-chip, write-back primary cache, the 
predominant type of transaction for most applications is burst-read memory operations,
followed by burst-write memory operations, I/O controller interface operations, and single-
beat (noncacheable or write-through) memory read and write operations. Additionally,
there can be address-only operations, variants of the burst and single-beat operations 
(global memory operations that are snooped, and atomic memory operations, for example), 
and address retry activity (for example, when a snooped read access hits a modified line in 
the cache). 

The cache has one address port dedicated to instruction fetch and load/store accesses and 
one dedicated to snooping transactions on the system interface. Therefore, snooping does 
not require additional clock cycles unless a snoop hit that requires a cache status update 
occurs. 

Figure 9-1. MPC601 Processor Block Diagram

9.1.2 Operation of the Memory Unit for Loads and Stores

As shown in Figure 9-1, the memory unit includes two read-queue elements and three 
write-queue elements. The read queue buffers are used for holding addresses for read 
operations; the write queue buffers are used for holding addresses and data for write 
operations and to support such features as address pipelining, snooping, and write 
buffering, described as follows: 

• The two read-queue elements allow the system interface logic to buffer as many as 
two outstanding read operations. There are two restrictions that apply to filling the 
two read-queue elements described as follows: 

— There cannot be two outstanding load operations. 

— There cannot be two outstanding read-with-intent-to-modify instructions. 

• Note that when a read miss causes the cache to be updated, only the sector with the 
required data is guaranteed to be updated. The other sector can be updated only if 
both read-queue elements are free. The update of the other sector can be disabled by 
setting bit 26 in the HID0 register (HID0[DRF]). 

• Two of the three write-queue elements, marked "A" and "B" in Figure 9-1, are 
buffers for write operations. They buffer store operations and sectors that are written 
back to memory such as when a cache location is updated after a cache miss. This 
allows the cache to be updated before the replaced sector is written back to system 
memory. 

• The third queue element, marked "snoop" in Figure 9-1, is provided to support 
high-priority copy-back operations that result from snoop hits to modified data 
(cache-sector snoop push-out operations while a read operation is pending on the bus). 
Snoop hits to modified data create a high-priority store operation that allows the 
processor to become bus master to store the modified data to memory, where it in 
turn is read by the snooping device. 
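
The two read-queue restrictions listed earlier in this section can be sketched as a simple admission check. This is a hypothetical model, not the MPC601 logic: the function name, the string labels for read types, and the treatment of other read types (such as instruction fetches) as unrestricted are all assumptions made for illustration.

```python
# Hypothetical model of the read-queue restrictions; not the MPC601 logic.
READ_QUEUE_DEPTH = 2
# Assumed labels: at most one outstanding read of each of these kinds.
RESTRICTED_KINDS = {"load", "rwitm"}


def can_queue_read(outstanding, kind):
    """Return True if a read of type `kind` may join the outstanding reads."""
    if len(outstanding) >= READ_QUEUE_DEPTH:
        return False          # both read-queue elements are in use
    if kind in RESTRICTED_KINDS and kind in outstanding:
        return False          # no two loads, and no two RWITM reads
    return True
```

For example, a load may be queued behind an outstanding read-with-intent-to-modify operation, but not behind another load.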

The data bus supports one level of pipelining. 

9.1.3 Operation of the System Interface 

Memory accesses can occur in single-beat and four-beat burst data transfers. The address 
and data buses are independent for memory accesses to support pipelining and split 
transactions. The MPC601 can pipeline as many as two transactions and has limited support 
for out-of-order split-bus transactions. 

Memory is accessed through an arbitration mechanism that allows devices to compete for 
bus mastership. This arbitration mechanism is flexible, allowing the MPC601 to be 
integrated into systems that implement various fairness and bus-parking procedures to 
avoid arbitration overhead. Additional multiprocessor support is provided through 
coherency mechanisms that provide snooping, external control of the on-chip cache and 
TLB, and support for a secondary cache. Multiprocessor software support is provided 
through the use of atomic memory operations. 

Typically, memory accesses are weakly ordered — sequences of operations, including 
load/store string and multiple instructions, do not necessarily complete in the order they 
begin — maximizing the efficiency of the bus without sacrificing coherency of the data. The 
MPC601 allows read operations to precede store operations (except when a dependency 
exists, of course). In addition, the MPC601 may reorder high priority store operations 
ahead of lower priority store operations. Because the processor can dynamically optimize 
run-time ordering of load/store traffic, overall performance is improved. 

Note that the Synchronize (sync) or Enforce In-Order Execution of I/O (eieio) instruction 
can be used to enforce strong ordering. 

The following sections describe how the MPC601 interface operates, providing detailed 
timing diagrams that illustrate how the signals interact. A collection of more general timing 
diagrams is included as examples of typical bus operations. 

Figure 9-2 is a legend of the conventions used in the timing diagrams. 

This is a synchronous interface — all MPC601 input signals except the PCLK_EN signals 
are sampled relative to the rising edge of the bus clock cycle. Outputs are driven off the 
same rising edge of bus clock cycle (see the electrical specifications for exact timing 
information). 

9.1.4 I/O Controller Interface Accesses 

Memory and I/O controller interface accesses use the MPC601 signals differently. 

The MPC601 defines separate memory and I/O address spaces, or segments, distinguished 
by the segment register T-bit in the address translation logic of the MPC601. If the T-bit is 
cleared, the memory reference is a normal memory access and can use the virtual memory 
management hardware of the MPC601. If the T-bit is set, the memory reference is an I/O 
controller interface access. 

The function and timing of some address transfer and attribute signals (such as TT0-TT3, 
TBST, and TSIZ0-TSIZ2) are changed for I/O controller interface accesses. Additional 
controls are required to facilitate transfers between the MPC601 and intelligent I/O devices. 
I/O controller interface and memory transfers are distinguished from one another by their 
address transfer start signals — TS indicates that a memory transfer is starting and XATS 
indicates that an I/O controller interface transaction is starting. 

Unlike memory accesses, I/O controller interface accesses cannot be pipelined and must be 
strongly ordered — each access occurs in strict program order and completes before another 
access can begin. For this reason, I/O controller interface accesses are less efficient than 
memory accesses. The I/O extensions also allow for additional bus pacing and multiple 
transaction operations for variably-sized data transfers (1 to 128 bytes), and they support a 
tagged, split request/response protocol. The I/O controller interface access protocol also 
requires the slave device to function as a bus master. 
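
The distinction between memory accesses and I/O controller interface accesses drawn in this section can be summarized in a small lookup. The sketch below is illustrative only; the `AccessClass` type, its field names, and the function name are hypothetical, and the "weakly ordered" property of memory accesses is the typical case described in the text rather than a guarantee.

```python
from typing import NamedTuple


class AccessClass(NamedTuple):
    """Hypothetical summary of how an access behaves on the bus."""
    kind: str
    start_signal: str       # address transfer start signal used
    can_pipeline: bool
    strongly_ordered: bool


def classify_access(t_bit):
    """Classify an access from the segment register T-bit."""
    if t_bit:
        # T-bit set: I/O controller interface access, started with XATS,
        # not pipelined, and performed in strict program order.
        return AccessClass("io_controller_interface", "XATS", False, True)
    # T-bit cleared: normal memory access, started with TS; typically
    # weakly ordered and eligible for address pipelining.
    return AccessClass("memory", "TS", True, False)
```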
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Figure 9-2. Timing Diagram Legend 

9.2 Memory Access Protocol 

Memory accesses are divided into address and data tenures. Each tenure has three phases — 
bus arbitration, transfer, and termination. The MPC601 also supports address-only 
transactions. Note that address and data tenures can overlap, as shown in Figure 9-3. 

Figure 9-3 shows that the address and data tenures are distinct from one another and that 
both consist of three phases — arbitration, transfer, and termination. Having independent 
address and data tenures allows address pipelining (indicated in Figure 9-3 by the fact that the 
data tenure begins before the address tenure ends) and split-bus transactions to be 
implemented at the system level in multiprocessor systems. Figure 9-3 shows a data 
transfer that consists of a single-beat transfer of as many as 64 bits. Four-beat burst transfers 
of 32-byte cache sectors require data transfer termination signals for each beat of data. 

Figure 9-3. Overlapping Tenures on the MPC601 Bus for a Single-Beat Transfer 

The basic functions of the address and data tenures are as follows: 

• Address tenure 

— Arbitration: During arbitration, address bus arbitration signals are used to gain 
mastership of the address bus. 

— Transfer: After the MPC601 is the address bus master, it transfers the address 
on the address bus. The address signals and the transfer attribute signals control 
the address transfer. The address parity and address parity error signals ensure 
the integrity of the address transfer. 

— Termination: After the address transfer, the system signals that the address 
tenure is complete or that it must be repeated. 

• Data tenure 

— Arbitration: To begin the data tenure, the MPC601 arbitrates for mastership of 
the data bus. 

— Transfer: After the MPC601 is the data bus master, it samples the data bus for 
read operations or drives the data bus for write operations. The data parity and 
data parity error signals ensure the integrity of the data transfer. 

— Termination: Data termination signals are required after each data beat in a data 
transfer. Note that in a single-beat transaction, the data termination signals also 
indicate the end of the tenure, while in burst accesses, the data termination 
signals apply to individual beats and indicate the end of the tenure only after the 
final data beat. 

The MPC601 bus supports address-only transfers, which use only the address bus, with no 
data transfer involved. This is useful in multiprocessor environments where external 
control of on-chip primary caches and TLB entries is desirable. Additionally, the MPC601's 
retry capability provides an efficient snooping protocol for systems with multiple memory 
systems (including caches) that must remain coherent. 

9.2.1 Arbitration Signals 

Arbitration for both address and data bus mastership in a multiprocessor system is 
performed by a central, external arbiter and, minimally, by the arbitration signals shown in 
Section 8.2.1, "Address Bus Arbitration Signal." Most arbiter implementations require 
additional signals to coordinate bus master/slave/snooping activities. Note that address bus 
busy (ABB) and data bus busy (DBB) are bidirectional signals. These signals are inputs 
unless the MPC601 has mastership of one or both of the respective buses; they must be 
connected high through pull-up resistors so that they remain negated when no devices have 
control of the buses. 

The following list describes the address arbitration signals: 

• BR (bus request) — Assertion indicates that the MPC601 is requesting mastership 
of the address bus. 

• BG (bus grant) — Assertion indicates that the MPC601 may, with the proper 
qualification, assume mastership of the address bus. A qualified bus grant occurs 
when BG is asserted and ABB and ARTRY are negated. 

If the MPC601 is parked, BR need not be asserted for the qualified bus grant. 

• ABB (address bus busy) — Assertion indicates that the MPC601 is the address bus 
master. 

The following list describes the data arbitration signals: 

• DBG (data bus grant) — Indicates that the MPC601 may, with the proper 
qualification, assume mastership of the data bus. A qualified data bus grant occurs 
when DBG is asserted while DBB, DRTRY, and ARTRY are negated. 

The DBB signal is driven by the current bus master, DRTRY is driven only from the bus, 
and ARTRY is from the bus, but only for the address bus tenure associated with the 
current data bus tenure (that is, not from another address tenure). 

• DBWO (data bus write only) — Assertion indicates that the MPC601 may run the 
data bus tenure for an outstanding write address even if a read address is pipelined 
before the write address. If DBWO is asserted, the MPC601 only assumes data bus 
mastership for a pending data bus write operation (that is, the MPC601 does not take 
the data bus for a pending read operation if this input is asserted along with DBG). 
Care must be taken with DBWO to ensure the desired write is queued (for example, 
a cache-sector snoop push-out operation). 

• DBB (data bus busy) — Assertion indicates that the MPC601 is the data bus master. 
The MPC601 always assumes data bus mastership if it needs the data bus and is 
given a qualified data bus grant (see DBG). 

For more detailed information on the arbitration signals, refer to Section 8.2.1, 
"Address Bus Arbitration Signal," and Section 8.2.6, "Data Bus Arbitration 
Signals." 
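
The qualification rules in the two lists above can be written as boolean expressions. The following Python sketch is a minimal illustration: the function names are hypothetical, each argument is True when the corresponding signal is asserted (logical assertion, not electrical level), and the DBWO handling is a simplified reading of the text.

```python
def qualified_bus_grant(bg, abb, artry):
    """Address bus: BG asserted while ABB and ARTRY are negated."""
    return bg and not abb and not artry


def qualified_data_bus_grant(dbg, dbb, drtry, artry):
    """Data bus: DBG asserted while DBB, DRTRY, and ARTRY are negated."""
    return dbg and not dbb and not drtry and not artry


def takes_data_bus(qualified, dbwo, pending_read, pending_write):
    """Simplified model of whether the processor assumes data bus mastership.

    With DBWO asserted, the bus is taken only for a pending write;
    otherwise it is taken whenever a data tenure is needed.
    """
    if not qualified:
        return False
    if dbwo:
        return pending_write
    return pending_read or pending_write
```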
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9.2.2 Address Pipelining and Split-Bus Transactions 

The MPC601 protocol provides independent address and data bus capability to support 
pipelined and split-bus transaction system organizations. Address pipelining allows new 
bus transactions to begin before the current transaction has finished by overlapping the data 
bus tenure associated with a previous address bus tenure with one or more successive 
address tenures. Split-bus transaction capability allows the address bus and data bus to have 
different masters at the same time. 

While this capability does not inherently reduce memory latency, support for address 
pipelining and split-bus transactions can greatly improve effective bus/memory 
throughput. For this reason, these techniques are most effective in shared-memory 
multiprocessor implementations where bus bandwidth is an important measurement of 
system performance. 

External arbitration is required in systems in which multiple devices must compete for the 
system bus. The design of the external arbiter affects pipelining by regulating address bus 
grant (BG), data bus grant (DBG), and AACK signals. For example, a one-level pipeline is 
enabled by asserting AACK to the current address bus master and granting mastership of 
the address bus to the next requesting master before the current data bus tenure has 
completed. Two address tenures can occur before the current data bus tenure completes. 

The MPC601 can pipeline its own transactions to a depth of one level (intraprocessor 
pipelining); however, the MPC601 bus protocol does not constrain the maximum number 
of levels of pipelining that can occur on the bus between multiple masters (interprocessor 
pipelining). The external arbiter must control the pipeline depth and synchronization 
between masters and slaves. 

In a pipelined implementation, data bus tenures are kept in strict order with respect to 
address tenures. However, external hardware can further decouple the address and data 
buses, allowing the data tenures to occur out of order with respect to the address tenures. 
This requires some form of system tag to associate the out-of-order data transaction with 
the proper originating address transaction (not defined for the MPC601 interface). 
Individual bus requests and data bus grants from each processor can be used by the system 
to implement tags to support interprocessor, out-of-order transactions. 

The MPC601 supports a limited intraprocessor out-of-order, split-transaction capability via 
the data bus write only (DBWO) signal. For more information about using DBWO, see 
Section 9.10, "Using DBWO (Data Bus Write Only)." 

9.3 Address Bus Tenure 

This section describes the three phases of the address tenure — address bus arbitration, 
address transfer, and address termination. 

9.3.1 Address Bus Arbitration 

When the MPC601 needs access to the external bus and it is not parked (BG is negated), it 
asserts bus request (BR) until it is granted mastership of the bus and the bus is available 
(see Figure 9-4). The external arbiter must grant master-elect status to the potential master 
by asserting the bus grant (BG) signal. The MPC601 requesting the bus determines that the 
bus is available when the ABB input is negated. When the address bus is not busy (ABB 
input is negated), BG is asserted, and the address retry (ARTRY) input is negated, the 
condition is referred to as a qualified bus grant. When it receives a qualified bus grant, the 
potential master assumes address bus mastership by asserting ABB. 

The MPC601 also provides an internally generated address bus busy signal, which it 
logically ORs with the ABB signal received off of the bus. This internal address bus busy 
signal is asserted with any TS or XATS signal and is negated with a valid AACK. This 
internally generated address bus busy signal is useful in systems that do not use ABB. 
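
The internally generated address bus busy signal can be modeled as a one-bit state that is set by TS or XATS, cleared by AACK, and ORed with the external ABB input. The sketch below is a hypothetical illustration; the class name, the per-clock update order, and the priority given to TS/XATS over AACK within a single clock are assumptions, not documented MPC601 timing.

```python
class AddressBusBusyTracker:
    """Hypothetical model of the internal address bus busy signal."""

    def __init__(self):
        self.internal_busy = False

    def clock(self, ts, xats, aack, abb):
        """Advance one bus clock and return the effective busy status.

        Each argument is True when the corresponding signal is asserted.
        """
        if ts or xats:
            self.internal_busy = True     # a new address tenure has started
        elif aack:
            self.internal_busy = False    # the address tenure is acknowledged
        # The internal signal is logically ORed with the ABB input.
        return self.internal_busy or abb
```

In a system that leaves ABB unconnected (always negated), the returned value depends only on the TS, XATS, and AACK history.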



Figure 9-4. Address Bus Arbitration 

External arbiters must allow only one device at a time to be address bus master. In 
implementations in which no other device can be a master, BG can be grounded (always 
asserted) to continually grant mastership of the address bus to the MPC601. 

If the MPC601 asserts BR before the external arbiter asserts BG, the MPC601 is considered 
to be unparked, as shown in Figure 9-4. Figure 9-5 shows the parked case, where a qualified 
bus grant exists on the clock edge following a need_bus condition. Notice that the bus clock 
cycle required for arbitration is eliminated if the MPC601 is parked, reducing overall 
memory latency for a transaction. The MPC601 always negates ABB for at least one bus 
clock cycle after AACK is asserted, even if it is parked and has another transaction pending. 

Typically, bus parking is provided to the device that was the most recent bus master; 
however, system designers may choose other schemes, such as providing unrequested bus 
grants in situations where it is easy to correctly predict the next device requesting bus 
mastership. 

Figure 9-5. Address Bus Arbitration Showing Bus Parking 

When the MPC601 receives a qualified bus grant, it assumes address bus mastership by 
asserting ABB and negating the BR output signal. Meanwhile, the MPC601 drives the 
address for the requested access onto the address bus and asserts TS to indicate the start of 
a new transaction. 

When designing external bus arbitration logic, note that the MPC601 may assert BR 
without using the bus after it receives the qualified bus grant. For example, in a system 
using bus snooping, if the MPC601 asserts BR to perform a replacement copy-back 
operation, another device can invalidate that sector before the MPC601 is granted 
mastership of the bus. Once the MPC601 is granted the bus, it no longer needs to perform 
the copy-back operation; therefore, the MPC601 does not assert ABB and does not use the 
bus for the copy-back operation. Note that the MPC601 asserts BR for at least one clock 
cycle in these instances. 

9.3.2 Address Transfer 

During the address transfer, the physical address and all attributes of the transaction are 
transferred from the bus master to the slave device(s). Snooping logic may monitor the 
transfer to enforce cache coherency (see discussion about snooping in Section 9.3.3, 
"Address Transfer Termination"). The signals used in the address transfer include the 
following signal groups (see Figure 8-1): 

• Address transfer start signal: Transfer start (TS) 

Note that extended address transfer start (XATS) is used for I/O controller interface 
operations and has no function for memory accesses. See Section 9.6, "I/O 
Controller Interface Operation." 

• Address transfer signals: Address bus (A0-A31), address parity (AP0-AP3), and 
address parity error (APE) 

• Address transfer attribute signals: Transfer type (TT0-TT4), transfer code 
(TC0-TC1), transfer size (TSIZ0-TSIZ2), transfer burst (TBST), cache inhibit (CI), 
write-through (WT), global (GBL), and cache set element (CSE0-CSE2) 

Figure 9-6 shows that the timing for all of these signals, except TS and APE, is identical. 
All of the address transfer and address transfer attribute signals are combined into the 
ADDR+ grouping in Figure 9-6. The TS signal indicates that the MPC601 has begun an 
address transfer and that the address and transfer attributes are valid (within the context of 
a synchronous bus). The MPC601 always asserts TS (or XATS for I/O controller interface 
operations) coincident with ABB in multiprocessor systems. As an input, TS need not 
coincide with the assertion of ABB on the bus (that is, either TS or XATS can be asserted 
with, or on a subsequent clock cycle after, the assertion of ABB; the MPC601 tracks this 
transaction correctly). 

Figure 9-6. Address Bus Transfer 

In Figure 9-6, the address transfer occurs during bus clock cycles 1 and 2 (arbitration 
occurs in bus clock cycle 0 and the address transfer is terminated in bus clock 3). In this 
diagram, the address bus termination input, AACK, is asserted to the MPC601 on the bus 
clock following assertion of TS (as shown by the dependency line). This is the minimum 
duration of the address transfer for the MPC601; the duration can be extended by delaying 
the assertion of AACK for one or more bus clocks. 

9.3.2.1 Address Bus Parity 

The MPC601 always generates one bit of correct odd-byte parity for each of the four bytes 
of address when a valid address is on the bus. The calculated values are placed on the 
AP0-AP3 outputs when the MPC601 is the address bus master. If the MPC601 is not the master 
and TS and GBL are asserted together (qualified condition for snooping memory 
operations), the calculated values are compared with the AP0-AP3 inputs. If there is no 
match, the APE output is asserted. An address bus parity error causes a checkstop condition 
if the bus parity checkstop source is enabled in HID0. See Chapter 5, "Exceptions." 

The internal address parity error signal is further qualified by a valid address condition, 
since the APE signal may be asserted for invalid address bus conditions. APE does not, 
therefore, necessarily represent the state of the internal address parity error signal used to 
generate the machine check exception. 
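
As an illustration of per-byte odd parity, the following Python sketch computes four parity bits for a 32-bit address and compares them with a set of received bits. It is a minimal sketch under stated assumptions: the function names are hypothetical, and the mapping of AP0 to the most significant address byte (A0-A7 in big-endian bit numbering) is assumed here rather than taken from this section.

```python
def address_parity(addr):
    """Return [AP0, AP1, AP2, AP3] odd-parity bits for a 32-bit address.

    Odd parity: each parity bit is chosen so that the byte plus its
    parity bit together contain an odd number of ones.
    """
    bits = []
    for shift in (24, 16, 8, 0):      # assumed: AP0 covers the most significant byte
        byte = (addr >> shift) & 0xFF
        ones = bin(byte).count("1")
        bits.append(0 if ones % 2 else 1)
    return bits


def parity_mismatch(addr, ap_inputs):
    """Return True if the received parity bits do not match the address."""
    return address_parity(addr) != list(ap_inputs)
```

A snooping device in this model would report a parity error only when `parity_mismatch` is true during a qualified snoop condition.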

9.3.2.2 Address Transfer Attribute Signals 

The transfer attribute signals include several encoded signals such as the transfer type 
(TT0-TT4) signals, transfer burst (TBST) signal, transfer size (TSIZ0-TSIZ2) signals, and 
transfer code (TC0-TC1) signals. Section 8.2.4, "Address Transfer Attribute Signals," 
describes the encodings for the address transfer attribute signals. Note that TT0-TT4, 
TBST, and TSIZ0-TSIZ2 have alternate functions for I/O controller interface operations 
(see Section 9.6, "I/O Controller Interface Operation"). 

9.3.2.2.1 Transfer Type (TT0-TT4) Signals 

Snooping logic should fully decode the transfer type signals if the GBL signal is asserted. 
Slave devices can use the individual transfer type signals without fully decoding the group. 
The transfer type signals generally have the following individual functions: 

• TT0 — Special operations: The MPC601 drives this signal to indicate that the access 
is part of an atomic data access sequence. This signal is asserted by the MPC601 
whenever a bus transaction occurs in response to a lwarx/stwcx. (Load Word and 
Reserve Indexed/Store Word Conditional Indexed) instruction pair (see Chapter 3, 
"Addressing Modes and Instruction Set Summary"), an eciwx or ecowx instruction, 
or for a Translation Look-Aside Buffer Invalidate Entry (tlbie) operation. 

• TT1 — Read (/write) operations: The TT1 signal indicates whether the transaction is 
a read (TT1 high) or a write (TT1 low) transaction. This is valid for transactions that 
are not address only. 

• TT2 — Invalidate operations: When asserted with GBL, the TT2 output signal 
indicates that all other caches in the system should invalidate the cache entry on a 
snoop hit. If the snoop hit is to a modified entry, the sector should be copied back 
before being invalidated. 

• TT3 — Memory (/address-only) operations: Except for eciwx or ecowx instructions 
(TT0-TT3 encodings 1010 or 1110), the TT3 signal, when asserted, indicates that the 
associated data transfer is to/from memory. External logic can synthesize the data 
bus request from the combination of TS (or XATS) and TT3 (DBR=TS&TT3). If 
TT3 is not asserted with the address, the associated bus transaction is considered to 
be a broadcast operation that all bus participants must honor (or a reserved 
operation). This is an address-only transaction; the MPC601 does not need and will 
not acquire data bus ownership, even if it receives a qualified data bus grant. 
Figure 9-7 shows an address-only transaction. On the rising edge of bus cycle 2, TT3 
is not asserted; therefore, the data bus will not be needed. 

• TT4 — The TT4 signal is reserved for future expansion. 
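
The individual transfer type functions listed above suggest a partial decode that slave devices might perform. The sketch below is illustrative only: the function and dictionary key names are hypothetical, arguments are 1 when the signal is asserted, and the two eciwx/ecowx encodings named in the text are flagged separately rather than interpreted with the simple TT3 rule.

```python
# TT0-TT3 encodings that the text identifies with eciwx/ecowx.
EXTERNAL_CONTROL = {(1, 0, 1, 0), (1, 1, 1, 0)}


def decode_transfer_type(tt0, tt1, tt2, tt3):
    """Partially decode TT0-TT3 using the per-signal functions in the text."""
    bits = tuple(int(bool(b)) for b in (tt0, tt1, tt2, tt3))
    if bits in EXTERNAL_CONTROL:
        # The simple per-signal rules are not applied to these encodings.
        return {"external_control": True, "special": True,
                "address_only": None, "direction": None, "invalidate": None}
    address_only = not bits[3]
    return {
        "external_control": False,
        "special": bool(bits[0]),            # atomic sequence or tlbie
        "address_only": address_only,        # TT3 not asserted
        # TT1 is meaningful only when the transaction is not address only.
        "direction": None if address_only else ("read" if bits[1] else "write"),
        "invalidate": bool(bits[2]),         # meaningful when GBL is asserted
    }


def data_bus_request(ts, tt3):
    """DBR = TS & TT3, as given in the text (XATS may stand in for TS)."""
    return bool(ts) and bool(tt3)
```

Snooping logic, by contrast, should fully decode the transfer type group when GBL is asserted, as the text notes.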
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Figure 9-7. Address-Only Bus Transaction 

9.3.2.2.2 Transfer Size (TSIZ0-TSIZ2) Signals 

The transfer size signals (TSIZ0-TSIZ2) indicate the size of the requested data transfer as 
shown in Table 9-1. The TSIZ0-TSIZ2 signals may be used along with TBST and 
A29-A31 to determine which portion of the data bus contains valid data for a write transaction 
or which portion of the bus should contain valid data for a read transaction. Note that for a 
burst transaction (as indicated by the assertion of TBST) TSIZ0-TSIZ2 are always set to 
b'010'. Therefore, if the TBST signal is asserted, the memory system should transfer a total 
of eight words (32 bytes), regardless of the TSIZ0-TSIZ2 encoding. 

Table 9-1. Transfer Size Signal Encodings 

TBST      TSIZ0  TSIZ1  TSIZ2  Transfer Size 
Asserted  0      1      0      Eight-word burst 
Negated   0      0      0      Eight bytes 
Negated   0      0      1      One byte 
Negated   0      1      0      Two bytes 
Negated   0      1      1      Three bytes 
Negated   1      0      0      Four bytes 
Negated   1      0      1      Five bytes 
Negated   1      1      0      Six bytes 

The basic coherency size of the bus is defined to be 32 bytes (corresponding to one cache 
sector). Data transfers that cross an aligned, 32-byte boundary either must present a new 
address onto the bus at that boundary (for coherency consideration) or must operate as non- 
coherent data with respect to the MPC601. 
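
The encodings in Table 9-1 and the 32-byte coherency size can be expressed as two small helper functions. This is a sketch for illustration: the function names are hypothetical, `tsiz` is assumed to be the 3-bit value formed with TSIZ0 as the most significant bit, and encodings not listed in the table as reproduced here are rejected rather than guessed.

```python
COHERENCY_BYTES = 32   # basic coherency size of the bus (one cache sector)


def transfer_size_bytes(tbst_asserted, tsiz):
    """Return the transfer size in bytes for a TBST/TSIZ0-TSIZ2 combination."""
    if tbst_asserted:
        return 32                      # eight-word burst, regardless of TSIZ
    if tsiz not in range(0b000, 0b111):
        raise ValueError("encoding not listed in Table 9-1 as reproduced here")
    return 8 if tsiz == 0b000 else tsiz


def crosses_coherency_boundary(addr, nbytes):
    """Return True if the transfer spans an aligned 32-byte boundary."""
    return (addr % COHERENCY_BYTES) + nbytes > COHERENCY_BYTES
```

A transfer for which `crosses_coherency_boundary` is true would need a new address presented at the boundary, or would have to be treated as noncoherent data.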

9.3.2.3 Effect of Alignment in Data Transfers 

Table 9-2 lists the aligned transfers that can occur on the MPC601 bus. These are transfers 
in which the data is aligned to an address that is an integer multiple of the size of the data. 
For example, Table 9-2 shows that one-byte data is always aligned; however, for a four- 
byte word to be aligned, it must be oriented on an address that is a multiple of four. 







Table 9-2. Aligned Data Transfers 

                                              Data Bus Byte Lane(s) 
Transfer Size  TSIZ0  TSIZ1  TSIZ2  A29-A31   0  1  2  3  4  5  6  7 

Byte           0      0      1      000       √  —  —  —  —  —  —  — 
               0      0      1      001       —  √  —  —  —  —  —  — 
               0      0      1      010       —  —  √  —  —  —  —  — 
               0      0      1      011       —  —  —  √  —  —  —  — 
               0      0      1      100       —  —  —  —  √  —  —  — 
               0      0      1      101       —  —  —  —  —  √  —  — 
               0      0      1      110       —  —  —  —  —  —  √  — 
               0      0      1      111       —  —  —  —  —  —  —  √ 

Half word      0      1      0      000       √  √  —  —  —  —  —  — 
               0      1      0      010       —  —  √  √  —  —  —  — 
               0      1      0      100       —  —  —  —  √  √  —  — 
               0      1      0      110       —  —  —  —  —  —  √  √ 

Word           1      0      0      000       √  √  √  √  —  —  —  — 
               1      0      0      100       —  —  —  —  √  √  √  √ 

Double word    0      0      0      000       √  √  √  √  √  √  √  √ 

Notes: 

√ The byte portions of the requested operand that are read or written during that bus transaction. 

— These entries are not required and are ignored during read transactions and are driven with undefined 
data during all write transactions (except non-cacheable write transfers, in which data is mirrored on both 
word lanes if the transfer does not exceed four bytes). 

The MPC601 also supports misaligned memory operations. These transfers address 
memory that is not aligned to the size of the data being transferred (such as, a word read of 
an odd byte address). Although most of these operations hit in the primary cache (or 
generate burst memory operations if they miss), the MPC601 interface supports misaligned 
transfers within a double-word (64-bit aligned) boundary, as shown in Table 9-3. Note that 
the three-byte transfer in Table 9-3 is only one example of misalignment. As long as the 
attempted transfer does not cross a double-word boundary, the MPC601 can transfer the 
data on the misaligned address (for example, a word read from an odd byte-aligned address, 
or a seven-byte read from an odd byte-aligned address). 

An attempt to address data that crosses a double-word boundary requires two bus transfers 
to access the data. This is illustrated in the last example of a three-byte transfer in Table 9-3. 
The transfer requires two accesses — the first for the last two bytes of one double-word 
address, the second for one byte from the next double-word address. The TBST, 
TSIZ0-TSIZ2, and A29-A31 signals provide enough information to determine the size of the 
transfer and the data bus byte lanes involved in the misaligned transfer. 
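
The way an access is split at a double-word boundary can be sketched as follows. This is an illustrative model, not a description of the processor's internal sequencing: the function name is hypothetical, and it assumes that data bus byte lane n carries the byte at offset n within the double word, which is consistent with Tables 9-2 and 9-3.

```python
def bus_transfers(addr, size):
    """Split an access of `size` bytes (1 to 8) into bus transfers.

    Returns a list of (a29_a31, nbytes, lanes) tuples, one per transfer,
    where `lanes` lists the data bus byte lanes that carry data.
    """
    if not 1 <= size <= 8:
        raise ValueError("size must be between 1 and 8 bytes")
    offset = addr & 0x7                    # A29-A31: offset in the double word
    first = min(size, 8 - offset)          # bytes that fit before the boundary
    transfers = [(offset, first, list(range(offset, offset + first)))]
    remaining = size - first
    if remaining:
        # The rest starts at offset 0 of the next double word.
        transfers.append((0, remaining, list(range(remaining))))
    return transfers
```

For a three-byte access at an address ending in b'110', this yields a two-byte transfer on lanes 6 and 7 followed by a one-byte transfer on lane 0, matching the split example in Table 9-3.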

Although misaligned transfers are supported, they may degrade performance substantially. 
In addition to the double-word straddle boundary condition, the address translation logic 
can generate substantial exception overhead when the microcoded, sequenced, load/store 
multiple and load/store string instructions access misaligned data. It is strongly 
recommended that software attempt to align code and data where possible. 

Table 9-3. Misaligned Data Transfer (Three-Byte Examples) 

                                                   Data Bus Byte Lanes 
Transfer Size                TSIZ(0-2)  A29-A31    0  1  2  3  4  5  6  7 

Three bytes                  011        000        A  A  A  —  —  —  —  — 
                             011        001        —  A  A  A  —  —  —  — 
                             011        010        —  —  A  A  A  —  —  — 
                             011        011        —  —  —  A  A  A  —  — 
                             011        100        —  —  —  —  A  A  A  — 
                             011        101        —  —  —  —  —  A  A  A 

First transfer: two bytes    010        110        —  —  —  —  —  —  A  A 
Second transfer: one byte    001        000        A  —  —  —  —  —  —  — 

First transfer: one byte     001        111        —  —  —  —  —  —  —  A 
Second transfer: two bytes   010        000        A  A  —  —  —  —  —  — 

A: Byte lane used 
—: Byte lane not used 

9.3.2.4 Transfer Code (TC0-TC1) Signals 

The TC0 and TC1 signals provide supplemental information about the corresponding 
address. Note that the TCx signals can be used with the TT0-TT4 and TBST signals to 
further define the current transaction. These encodings may be useful for debugging. 

The meaning of TC0 depends on whether the current transaction is a read or write 
operation. On a read operation, TC0 asserted indicates that the transaction is an instruction 
fetch operation; otherwise, the read operation is a data operation. On an MPC601 write 
operation, TC0 asserted indicates that the write transfer is invalidating the associated sector. 
This is the case for a copy-back replacement, a Data Cache Block Flush (dcbf) instruction, or a 
snoop that causes invalidation (for example, a flush or kill). TC0 negated indicates the write is 
not invalidating any cache sector (for example, a replacement sector copy-back, write-through, 
or cache-inhibited write operation). 

The TC1 signal is asserted on read and RWITM operations to indicate that a low-priority 
operation to load the sector adjacent to one that was previously loaded due to a cache miss 
is queued; therefore, the next bus transaction will likely access the same page of memory. 
This operation may not be the next transaction if, for instance, a copy-back operation that 
resulted from a snoop hit is required. Note that TC1 asserted indicates to the memory 
system the likelihood that the next access is on the same page, but it does not guarantee this 
will occur because of transfer priorities and the bus traffic/code execution dynamics. TC1 
is negated for all write operations on the bus. 

Table 9-4 shows the encodings of the TC0 and TC1 signals. 

Table 9-4. Transfer Code Signal Encodings 

Signal  State     Definition 
------  --------  ------------------------------------------------------------------ 
TC0     Asserted  Bus operation is an instruction fetch. 
                  Write: Operation is invalidating the cache line in the MPC601. 
                  Kill (address only): Operation is invalidating the cache line in 
                  the MPC601. 
TC0     Negated   Bus operation is a data read. 
                  Write: Operation is not invalidating the cache line in the MPC601. 
                  Kill (address only): Operation is not invalidating the cache line. 
TC1     Asserted  The next access is likely to be on same page. A sector has been 
                  loaded, and a low-priority load of the adjacent sector is queued. 
TC1     Negated   The next access is not likely to be on the next page; an optional 
                  low-priority load of an adjacent sector is not queued. 
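
As an illustration of Table 9-4, the following minimal sketch turns a sampled TC0/TC1 pair into the descriptions given in the table. The function name, the boolean convention (True means asserted), and the operation labels "read", "write", and "kill" are assumptions made for this example; they are not part of the MPC601 definition.

```python
from typing import List


def describe_transfer_code(tc0: bool, tc1: bool, operation: str) -> List[str]:
    """Return Table 9-4 style descriptions for a sampled TC0/TC1 pair.

    operation is a hypothetical label: "read", "write", or "kill".
    True means the signal is asserted.
    """
    notes = []
    if operation == "read":
        # On reads, TC0 distinguishes instruction fetches from data reads.
        notes.append("instruction fetch" if tc0 else "data read")
    elif operation in ("write", "kill"):
        # On writes and kills, TC0 reports whether the cache line is invalidated.
        notes.append("invalidating cache line" if tc0
                     else "not invalidating cache line")
    else:
        raise ValueError(f"unknown operation: {operation}")
    # TC1 is a hint about the next access, not a guarantee.
    notes.append("adjacent-sector load queued" if tc1
                 else "adjacent-sector load not queued")
    return notes
```

Such a decoder is mainly useful in a logic-analyzer or simulation trace, where the transfer code signals help with debugging.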



9.3.3 Address Transfer Termination 

The MPC601 does not terminate the address transfer until the AACK (address 
acknowledge) input is asserted; therefore, the system can extend the address transfer phase 
by delaying the assertion of AACK to the MPC601. Although AACK can be asserted as 
early as the bus clock cycle following TS (see Figure 9-8), MPC601 address transfers 
require a minimum of three bus clock cycles, which is enough time to negate and tristate the 
shared ARTRY and SHD signals with no contention between devices. As shown in Figure 9-8, 
these signals are asserted for one bus clock cycle, tristated for the next bus clock cycle, 
driven high for the next 2X_PCLK cycle time, and finally tristated. Note that AACK is 
asserted for only one bus clock cycle. 



Note that precharging of the ARTRY and SHD signals during the negation period can be 
disabled by enabling HID[29]. After ARTRY and SHD are negated, they will be three-stated 
for two bus cycles, and the system is responsible for precharging both the ARTRY and 
SHD signals. This allows masters in a system that uses both 3.6-V and 5-V levels to use the 
same system bus. 



The address transfer can be terminated with the requirement to retry if ARTRY is asserted 
during the bus clock cycle following AACK. If ARTRY is asserted in this window, the 
MPC601 negates BR in the following bus clock cycle; after that, it attempts to retry the 
address transfer. By delaying the bus request by one bus clock cycle, the protocol provides 
an opportunity for the snooping device that asserted ARTRY to access the bus next, and 
therefore retry determinacy is possible. In order for the retry determinacy to be guaranteed, 
however, the external bus arbitration logic must ensure that the snooping device has access 
to the bus next. 

The only valid window for the ARTRY input is the one bus clock cycle following the 
assertion of AACK. Snooping devices must monitor the assertion of AACK to know when 
to deassert/tristate ARTRY, as shown in Figure 9-8. The assertion of ARTRY/SHD can be 
derived in one of the following ways: 

• ARTRY/SHD can be asserted on the second clock after TS is asserted 

• ARTRY/SHD can be asserted before AACK is asserted, but is not qualified by the 
master MPC601 until the clock after AACK is asserted 
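
The ARTRY window rule can be sketched as a check over sampled signals. This is a minimal illustration, assuming hypothetical per-bus-clock sample lists in which True means asserted; it models only the rule that ARTRY is qualified in the single bus clock cycle following AACK, and nothing else about the bus.

```python
from typing import List


def qualified_artry(aack: List[bool], artry: List[bool]) -> bool:
    """Return True if ARTRY is asserted in the cycle after AACK is asserted.

    aack and artry are hypothetical per-bus-clock samples (True = asserted),
    both indexed by the same bus clock cycle number.
    """
    for cycle in range(len(aack) - 1):
        if aack[cycle]:
            # Only the bus clock cycle following AACK is a valid ARTRY window.
            return artry[cycle + 1]
    return False
```

An ARTRY that is asserted early but released before the window does not count as a retry in this model, which matches the second case in the list above.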




Figure 9-8. Snooped Address Cycle with ARTRY 



The MPC601 requires that the first (or only) TA not be asserted before AACK (note that 
TA can be held off directly by the slave device delaying AACK assertion or indirectly by 
an external arbiter delaying DBG assertion). This requirement guarantees the relationship 
between TA and ARTRY/SHD such that, in the case of an address retry, the MPC601 can 
purge the data/instructions from its data path queues and waive off the data/instructions 
before they are forwarded to the cache/CPU. 

When the data tenure begins before the address tenure is complete, if the MPC601 has 
asserted DBB, assertion of ARTRY causes the MPC601 to terminate the data bus 
transaction and retry both the address and data tenures later. If the transfer is a single-beat 
transfer and TA occurs as early as the AACK window, there is no indication of an early data 
bus termination. However, if a burst transaction is in progress, the MPC601 negates DBB 
early in response to ARTRY. The system logic does not need to assert TA for four bus clock 
cycles in this case. 

If DBB is not asserted until the ARTRY window and ARTRY is asserted, the MPC601 does 
not become data bus master. Note that some system designs, such as single-master systems, 
do not require the use of ARTRY. 



For information about ARTRY scenarios, see Section 9.3.3.1, "Address Retry Sources." 
For information about MESI protocol and its effect on address tenure termination, refer to 
Section 9.4.4, "Memory Coherency — MESI Protocol." 

9.3.3.1 Address Retry Sources 

The SHD and ARTRY input signals provide sufficient information for the 
appropriate handling of cache sector coherency. They encode information about a 
transaction, as shown in Table 9-5. 

Table 9-5. Address Retry Causes 

SHD             ARTRY           Definition 
--------------  --------------  ------------------------------------------- 
High impedance  High impedance  Exclusive. No snoop hit. Pipeline not busy. 
Negated         Asserted        Pipeline busy. Queuing retry 
Asserted        Negated         Snoop hit (shared) 
Asserted        Asserted        Snoop hit (modified) 



If the SHD and ARTRY inputs are not asserted for a cache-sector fill operation, the sector 
is marked as exclusive (see Section 9.4.4, "Memory Coherency — MESI Protocol"). If the 
SHD input is asserted without ARTRY, the sector is marked as shared. 

NOTE: If the invalidate (TT2) input signal is asserted for the transaction, the sector is 
marked exclusive regardless of the state of the SHD signal. If ARTRY is asserted without 
SHD, a device cannot service the address transaction currently (because of queuing 
constraints) and the transaction is retried later. The MPC601 reacts to the assertion of 
ARTRY the same way, regardless of the state of SHD. The timing of the SHD input is the 
same as the timing for ARTRY. 



One or more devices can indicate a queuing retry condition by asserting ARTRY while one 
or more devices separately indicate the snoop-hit shared condition by asserting SHD. This 
condition appears as a snoop hit modified condition on the bus, since both SHD and 
ARTRY are asserted. This is not a problem for the MPC601 since ARTRY is not qualified 
by SHD (that is, SHD is a don't care if ARTRY is asserted to the MPC601). 
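
The handling described in this section can be summarized in a short sketch. The function names and return strings are hypothetical, True means asserted, and a high-impedance (undriven) signal is treated as not asserted; the logic follows Table 9-5 and the notes above, including the rule that ARTRY takes effect regardless of SHD.

```python
def bus_condition(shd: bool, artry: bool) -> str:
    """Name the condition that Table 9-5 associates with SHD and ARTRY."""
    if shd and artry:
        return "snoop hit (modified)"
    if shd:
        return "snoop hit (shared)"
    if artry:
        return "pipeline busy, queuing retry"
    return "exclusive, no snoop hit"


def sector_fill_outcome(shd: bool, artry: bool, invalidate: bool = False) -> str:
    """Return what the master does with a cache-sector fill.

    invalidate models the TT2 (invalidate) signal being asserted.
    """
    if artry:
        # ARTRY is not qualified by SHD: the transaction is retried later.
        return "retry"
    if shd and not invalidate:
        return "mark shared"
    # No SHD, or TT2 asserted: the sector is marked exclusive.
    return "mark exclusive"
```

Note that a queuing retry from one device combined with SHD from another is indistinguishable on the bus from a snoop hit on a modified sector; in both cases the sketch returns "retry".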



9.4 Data Bus Tenure 

This section describes the data bus arbitration, transfer, and termination phases defined by 
the MPC601 memory access protocol. The phases of the data tenure are identical to those 
of the address tenure, underscoring the symmetry in the control of the two buses. 

9.4.1 Data Bus Arbitration 

Data bus arbitration uses the data arbitration signal group, that is, DBG, DBWO, and DBB. 
Additionally, the combination of TS or XATS and TT3 (address-only signal) functions as a 
data bus request. 

The TS signal is an implied data bus request from the MPC601; the arbiter must qualify TS 
with the transfer type (TT) encodings to determine if the current address transfer is an 
address-only operation, which does not require a data bus transfer (see Figure 9-7). If the 
data bus is needed, the arbiter grants data bus mastership by asserting the DBG input to the 
MPC601. As with the address-bus arbitration phase, the MPC601 must qualify the DBG 
input with a number of input signals before assuming bus mastership, as shown in 
Figure 9-9. 




Figure 9-9. Data Bus Arbitration 

A qualified data bus grant can be expressed as the following: 

QDBG = DBG asserted while DBB, DRTRY, and ARTRY are negated 
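
The qualified data bus grant expression translates directly into a boolean function. This is a minimal sketch in which True means the signal is asserted (the logical state, not the electrical level); the function name is an assumption for illustration.

```python
def qualified_data_bus_grant(dbg: bool, dbb: bool, drtry: bool, artry: bool) -> bool:
    """QDBG: DBG asserted while DBB, DRTRY, and ARTRY are all negated."""
    return dbg and not (dbb or drtry or artry)
```

For example, a grant that arrives while another master still holds DBB is not qualified, so the processor waits rather than taking the data bus.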



When a data tenure overlaps with its associated address tenure, a qualified ARTRY 
assertion coincident with a data bus grant does not result in data bus mastership (DBB is 
not asserted). Otherwise, the MPC601 always asserts DBB on the bus clock cycle after 
recognition of a qualified data bus grant. Since the MPC601 can pipeline transactions, there 
may be an outstanding data bus transaction when a new address transaction is retried. In 
this case, the MPC601 becomes the data bus master to complete the previous transaction. 

9.4.1.1 Using the DBB Signal 

The DBB signal should be connected between masters only if data tenure hand-off is left 
to the masters. The memory system can control data hand-off directly with DBG. 

The MPC601 asserts DBB throughout the data transaction; however, the MPC601 does not 
park the data bus and assert DBB across multiple transactions. DBB is negated on the bus 
clock cycle after a final TA is received from the bus. 

9.4.2 Data Transfer 

The data transfer signals include DH0-DH31, DL0-DL31, DP0-DP7 and DPE. For 
memory accesses, the DH and DL signals form a 64-bit data path for read and write 
operations. 

The MPC601 transfers data in either single- or four-beat burst transfers. Single-beat 
operations can transfer from one to eight bytes at a time and can be misaligned (see 
Section 9.3.2.3, "Effect of Alignment in Data Transfers"). Burst operations always transfer 
eight words and are aligned to four- or eight-word address boundaries. Burst transfers can 
achieve significantly higher bus throughput than single-beat operations. 

The type of transaction initiated by the MPC601 depends on whether the code or data is 
cacheable and, for store operations, whether the cache is operated in write-back or 
write-through mode, which the MMU controls on either a page or block basis. Burst transfers 
support cacheable operations only; that is, memory structures must be marked as cacheable 
(and write-back for data store operations) in the respective TLB entry to take advantage of 
burst transfers. 

The MPC601 output TBST indicates to the system whether the current transaction is a 
single- or four-beat transfer. A burst transfer has an assumed address order. For load or store 
operations that miss in the cache (and are marked as cacheable and, for stores, write-back 
in the MMU), the MPC601 presents the quad-word-aligned address associated with the 
critical code or data that initiated the transaction. This minimizes latency by allowing the 
critical code or data to be forwarded to the processor before the rest of the sector is filled. 
For all other burst operations, however, the sector is transferred beginning with the 
oct-word-aligned data. Note that this difference can complicate cache-to-cache 
implementations. 
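
The two starting-address rules can be illustrated with simple address masking. This sketch assumes 4-byte words, so a quad word is 16 bytes and an oct-word is 32 bytes (one eight-word burst); the function and constant names are hypothetical.

```python
QUAD_WORD_BYTES = 16  # four 4-byte words
OCT_WORD_BYTES = 32   # eight 4-byte words, one full burst


def burst_start_address(address: int, critical_first: bool) -> int:
    """Return the address presented at the start of a burst.

    critical_first=True models a cacheable load/store miss, which starts at
    the quad-word-aligned address of the critical data. Otherwise the burst
    starts at the oct-word-aligned address.
    """
    size = QUAD_WORD_BYTES if critical_first else OCT_WORD_BYTES
    # Clearing the low-order bits aligns the address down to the boundary.
    return address & ~(size - 1)
```

For an access at 0x1018, the first case starts at 0x1010 and the second at 0x1000, which is the difference a cache-to-cache design has to account for.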

The MPC601 does not directly support interfacing to subsystems with less than a 64-bit 
data path (except for I/O controller interface operations, which are discussed in Section 9.6, 
"I/O Controller Interface Operation"). However, the MPC601 duplicates, or mirrors, the 
transfer data on the unused word lane for store operations to pages marked as 
non-cacheable. This means, for example, that for a non-cacheable byte store operation, the 
valid byte is present on two byte lanes: one in the upper word and one in the lower word. 
For a word store operation, the word is mirrored across both word lanes. Unused byte lanes 
are undefined. 

The data is not mirrored, however, for other store operations (including write-through). A 
cache hit causes the double word of data containing the data being transferred to be output 
on the data bus lanes. 
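
As a rough illustration of mirroring for a non-cacheable byte store, the sketch below computes the two byte lanes that would carry the valid byte. It rests on assumptions that are not stated in this section: byte lanes are numbered 0-7 with lanes 0-3 in the upper word (DH) and lanes 4-7 in the lower word (DL), and the mirrored copy occupies the same position within the other word. The function name is hypothetical.

```python
from typing import Tuple


def mirrored_byte_lanes(address: int) -> Tuple[int, int]:
    """Return (upper-word lane, lower-word lane) carrying a mirrored byte.

    Assumes lanes 0-3 form the upper word and lanes 4-7 the lower word,
    and that the byte keeps its position within each word.
    """
    position_in_word = address & 0x3
    return (position_in_word, position_in_word | 0x4)
```

A 32-bit subsystem attached to either word lane would then see the byte at the same position within its word, under the assumptions above.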

CAUTION 

While this information may be useful to some applications that 
do not cache data structures, data mirroring may not be 
supported on future versions of the MPC601 or other PowerPC 
processors. 

9.4.3 Data Transfer Termination 

Four signals are used to terminate data bus transactions: TA, DRTRY (data retry), TEA 
(transfer error acknowledge), and in some cases ARTRY. The TA signal indicates normal 
termination of data transactions. DRTRY indicates invalid read data in the previous bus 
clock cycle. TEA indicates a non-recoverable bus error event. ARTRY can also terminate 
a data bus transaction, but only if it occurs before the first assertion of TA. 

9.4.3.1 Normal Single-Beat Termination 

Normal termination of a single-beat data read operation occurs when TA is asserted by a 
responding slave. The TEA and DRTRY signals must remain negated during the transfer 
(see Figure 9-10). 




Figure 9-10. Normal Single-Beat Read Termination 

Normal termination of a single-beat data write transaction occurs when TA is asserted by 
a responding slave. TEA must remain negated during the transfer. The DRTRY signal is 
not sampled during data writes, as shown in Figure 9-11. As shown in both Figure 9-10 and 
Figure 9-11, the TT1 signal driven low by the MPC601 indicates a write is in progress. 

Figure 9-11. Normal Single-Beat Write Termination 

Normal termination of a burst transfer occurs when TA is asserted during four bus clock 
cycles, as shown in Figure 9-12. The bus clock cycles need not be consecutive, thus 
allowing pacing of the data transfer beats. For read bursts to terminate successfully, TEA 
and DRTRY must remain negated during the transfer. For write bursts, TEA must remain 
negated during the transfer. DRTRY is ignored during data writes. 
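
The burst termination rule (four TA assertions, not necessarily consecutive) can be sketched as a simple counter over sampled bus clock cycles. The sample list and function name are hypothetical, True means TA is asserted, and the sketch deliberately ignores TEA and DRTRY, which are assumed to stay negated.

```python
from typing import List, Optional

BEATS_PER_BURST = 4


def burst_end_cycle(ta: List[bool]) -> Optional[int]:
    """Return the index of the cycle in which the fourth TA is seen.

    ta holds one sample per bus clock cycle (True = asserted). Returns None
    if fewer than four data beats have been acknowledged.
    """
    beats = 0
    for cycle, asserted in enumerate(ta):
        if asserted:
            beats += 1
            if beats == BEATS_PER_BURST:
                return cycle
        # A negated TA is a wait state; the beat count does not advance.
    return None
```

With TA samples of asserted, negated, asserted, asserted, negated, asserted, the burst ends in the sixth sampled cycle (index 5), two cycles later than an unpaced burst.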




Figure 9-12. Normal Burst Transaction 



For read bursts, DRTRY may be asserted one bus clock cycle after TA is asserted to signal 
that the data presented with TA is invalid and that the processor must wait for the negation 
of DRTRY before forwarding data to the processor (see Figure 9-13). Thus, a data beat can 
be speculatively terminated with TA and then one bus clock cycle later confirmed with the 
negation of DRTRY. The DRTRY signal is valid only for read transactions. TA must be 
asserted on the bus clock cycle before the first bus clock cycle of the assertion of DRTRY; 
otherwise the results are undefined. 



The DRTRY signal extends data bus mastership such that other processors cannot use the 
data bus until DRTRY is negated. Therefore, in the example in Figure 9-13, DBB cannot 
be asserted until bus clock cycle 5. This is true for both read and write operations even 
though DRTRY does not hold the master on write operations. 




Figure 9-13. Termination with DRTRY 



Figure 9-14 shows the effect of using DRTRY during a burst read. It also shows the effect 
of using TA to pace the data transfer rate. Notice that in bus clock cycle 3 of Figure 9-14, 
TA is negated for the second data beat. The MPC601 data pipeline does not proceed until 
bus clock cycle 4 when TA is reasserted. 



Note that DRTRY is useful for systems that implement speculative forwarding of data such 
as those with direct-mapped, second-level caches where hit/miss is determined on the 
following bus clock cycle, or for parity- or ECC-checked memory systems. 

Note that DRTRY may not be implemented on other PowerPC processors. 

9.4.3.2 Data Transfer Termination Due to a Bus Error 

The TEA signal indicates that a bus error occurred. It may be asserted while DBB (and/or 
DRTRY for read operations) is asserted. Asserting TEA to the MPC601 terminates the 
transaction; that is, further assertions of TA and DRTRY are ignored and DBB is negated. 

Figure 9-14. Read Burst with TA Wait States and DRTRY 

Assertion of the TEA signal causes a machine-check exception (and possibly a check-stop 
condition within the MPC601). For more information, see Section 5.4.2, "Machine Check 
Exception (x'00200')." However, assertion of TEA does not invalidate data entering the 
GPR or the cache; therefore, the MPC601 may act on invalid code/data (although the 
exception will eventually be recognized, if enabled). Additionally, the corresponding 
address of the access that caused TEA to be asserted is not latched by the MPC601. To 
recover from this condition, the MPC601 must be reset; therefore, this function should only 
be used to flag fatal system conditions to the processor (such as parity or uncorrectable ECC 
errors). 

After the MPC601 has committed to run a transaction, that transaction must eventually 
complete. The separate address and data bus grants and address retry cause the transaction 
to be restarted; TA wait states and DRTRY assertion for reads delay termination of 
individual data beats. Eventually, however, the system must either terminate the transaction 
or assert the TEA signal to put the MPC601 into checkstop mode. For this reason, care must 
be taken to check for the end of physical memory and the location of certain system 
facilities. 

Note that TEA generates a machine-check exception depending on the ME bit in the MSR. 
Setting the checkstop enable control bits properly leads to a true checkstop condition. 

Note also that the MPC601 does not implement a synchronous error capability for memory 
accesses (see Section 9.6, "I/O Controller Interface Operation"). This means that the 
exception instruction pointer does not point to the memory operation that caused the 
assertion of TEA, but to the instruction about to be executed (perhaps several instructions 
later). 
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9.4.4 Memory Coherency— MESI Protocol 

The MPC601 provides dedicated hardware to provide memory coherency by snooping bus 
transactions. The address retry capability enforces the four-state, MESI cache-coherency 
protocol (see Figure 9-15). In addition to the hardware required to monitor bus traffic for 
coherency, the MPC601 has a cache port dedicated to snooping so that comparing cache 
entries to address traffic on the bus does not tie up the MPC601's on-chip cache. 

The global (GBL) signal output indicates whether the current transaction must be snooped 
by other snooping devices on the bus. Address bus masters assert GBL to indicate that the 
current transaction is a global access (that is, an access to memory shared by more than one 
processor/cache). If GBL is not asserted for the transaction, that transaction is not snooped. 
When other devices detect the GBL input asserted, they must respond by snooping the 
broadcast address. 

Normally, GBL reflects the M-bit value specified for the memory reference in the 
corresponding translation descriptor(s). Note that care must be taken to minimize the 
number of pages marked as global, because the retry protocol discussed in the previous 
section is used to enforce coherency and can require significant bus bandwidth. 

When the MPC601 is not the address bus master, GBL is an input. The MPC601 snoops a 
transaction if TS and GBL are asserted together in the same bus clock cycle (this is a 
qualified snooping condition). No snoop update to the MPC601 cache occurs if the snooped 
transaction is not marked global. This includes invalidation cycles. 
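
The qualified snooping condition reduces to a small predicate. In this minimal sketch, True means asserted, the samples are taken in the same bus clock cycle, and the function and parameter names are assumptions for illustration.

```python
def qualified_snoop(is_address_bus_master: bool, ts: bool, gbl: bool) -> bool:
    """Return True when a non-master device must snoop the broadcast address.

    ts and gbl are samples of the TS and GBL signals taken in the same
    bus clock cycle (True = asserted).
    """
    # A device does not snoop its own transactions, and transactions that
    # are not marked global are ignored, including invalidation cycles.
    return (not is_address_bus_master) and ts and gbl
```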

When the MPC601 detects a qualified snoop condition, the address associated with the TS 
is compared against the unified cache tags through a dedicated cache-tag port. Snooping 
completes if no hit is detected. If, however, the address hits in the cache, the MPC601 reacts 
according to the MESI protocol shown in Figure 9-15, assuming the WIM bits are set to 
write-back mode, caching allowed, and coherency enforced (WIM = 001). 

Note that, in Figure 9-15, write hits to clean lines of non-global pages do not generate 
invalidate broadcasts. There are several types of bus transactions that involve the 
movement of data that can no longer access the TLB M-bit (for example, replacement 
sector copy-back, snoop push, and table-search operations). In these cases, the hardware 
cannot determine whether the sector was originally marked global; therefore, the MPC601 
marks these transactions as non-global to avoid retry deadlocks. 

The MPC601's on-chip cache is implemented as an eight-way set-associative cache. To 
facilitate external monitoring of the internal cache tags, the cache set element (CSE0-CSE2) 
signals indicate which sector of the cache set is being replaced on read operations 
(including RWITM). Note that these signals are valid only for MPC601 burst operations; 
for all other bus operations, the CSE signals should be ignored. Table 9-6 shows the CSE 
encodings. 

[State diagram for Figure 9-15 not reproduced; the legend follows.] 

State transitions: 
RH = Read Hit 
RMS = Read Miss, Shared 
RME = Read Miss, Exclusive 
WH = Write Hit 
WM = Write Miss 
SHR = Snoop Hit on a Read 
SHW = Snoop Hit on a Write or Read-with-Intent-to-Modify 

Bus transactions: 
Snoop Push 
Invalidate Transaction 
Read-with-Intent-to-Modify 
Cache Sector Fill 

Figure 9-15. MESI Cache Coherency Protocol— State Diagram (WIM = 001) 



Table 9-6. CSE(0-2) Signals 

CSE0-CSE2  Cache Set Element 
---------  ----------------- 
000        Set 0 
001        Set 1 
010        Set 2 
011        Set 3 
100        Set 4 
101        Set 5 
110        Set 6 
111        Set 7 

9.5 Timing Examples 



This section shows timing diagrams for various scenarios. For information about 
conventions used in these diagrams, refer to Figure 9-2. 

Figure 9-16 illustrates the fastest single-beat reads. Note that all bidirectional signals go to 
high-impedance between bus tenures. 

[Timing diagram not reproduced. Recoverable signal labels: A0-A31, TT0-TT3, TBST, ARTRY, 
DBB, and D0-D63, shown over bus clock cycles 1-12.] 

Figure 9-16. Fastest Single-Beat Reads 

Figure 9-17 illustrates the fastest single-beat writes. Note that all bidirectional signals go to 
high-impedance between bus tenures. TT0-TT3 are binary encoded b'x001': TT0 can be 
either 0 or 1, TT1 and TT2 are 0, and TT3 is 1. 

Figure 9-17. Fastest Single-Beat Writes 

Figure 9-18 shows three ways to delay single-beat reads using data-delay controls: 

• The TA hold-off can be used to insert wait states in clock cycles 3 and 4. 

• For the second access, DBG could have been asserted in clock cycle 6. 

• In the third access, DRTRY is asserted in clock cycle 11 to flush the previous data. 

Note that all bidirectional signals go to high-impedance between bus tenures. Note also that 
two loads cannot be pipelined. The pipelining shown in Figure 9-17 can occur if the second 
access is not another load (for example, an instruction fetch). 

Figure 9-18. Single-Beat Reads Showing Data-Delay Controls 

Figure 9-19 shows data-delay controls in a single-beat write. Note that all bidirectional 
signals are set to high impedance between bus tenures. Data transfers are delayed in the two 
following ways: 

• The TA holdoff is used to insert wait states in clocks 3 and 4. 

• In clock 6, DBG is held negated, delaying the start of the data tenure. 

The last access is not delayed (DRTRY is valid only for read operations). 

Figure 9-19. Single-Beat Writes Showing Data Delay Controls 

Figure 9-20 shows three single-beat transfers back-to-back. Note that all bidirectional 
signals are set at high-impedance state between tenures. 

Figure 9-20. Back-to-Back Single-Beat Transfers 

Figure 9-21 shows the use of data-delay controls with burst transfers. Note that all 
bidirectional signals are set to high impedance between bus tenures. Note the following: 

• The first data beat of bursted read data (clock 0) is the critical quad word. 

• The write burst shows the use of TA holdoff on the third data beat. 

• The final read burst shows the use of DRTRY on the third data beat. 

• The address for the third transfer is held off until the first transfer completes. 

Figure 9-21. Burst Transfers with Data Delay Controls 

Figure 9-22 shows the use of the TEA signal. Note that all bidirectional signals are set to 
high impedance between bus tenures. Note the following: 

• The first data beat of the read burst (in clock 0) is the critical quad-word. 

• The TEA signal truncates the burst write transfer on the third data beat. 

• The MPC601 eventually interrupts on the TEA event. 

Figure 9-22. Use of Transfer Error Acknowledge (TEA) 

9.6 I/O Controller Interface Operation 

The MPC601 defines separate memory and I/O address spaces, or segments, distinguished 
by the segment register T-bit in the address translation logic of the MPC601. If the T-bit is 
cleared, the memory reference is a normal memory access and can use the virtual memory 
management hardware of the MPC601. If the T-bit is set, the memory reference is an I/O 
controller interface access. 

There are several architectural ramifications of I/O controller interface accesses, such as the 
following: 

• I/O controller interface accesses must be strongly ordered; for example, these 
accesses must run on the bus strictly in order with respect to the instruction stream. 

• I/O controller interface accesses must provide synchronous error reporting. 

Chapter 4, "Cache and Memory Unit Operation," describes architectural aspects of 
I/O controller interface segments, as well as an overview of the PowerPC's 
segmented address space management. 

The MPC601 defines two types of I/O controller interface segments (segment register T-bit 
set) based on the value of the bus unit ID (BUID). See Section 9.6.2, "I/O Controller 
Interface Transaction Protocol Details," for more information about the BUID. 

• I/O controller interface (BUID ≠ x'07F'): I/O controller interface accesses include 
all transactions between the MPC601 and subsystems (referred to as bus unit 
controllers (BUCs)) mapped through I/O controller interface address space. 

• Memory-forced I/O controller interface (BUID = x'07F'): Memory-forced I/O 
controller interface operations access memory space. They do not use the extensions 
to the memory protocol described for I/O controller interface accesses, and they 
bypass the page- and block-translation and protection mechanisms. The physical 
address is found by concatenating bits 28-31 of the respective segment register with 
bits 4-31 of the effective address. This address is marked as non-cacheable, 
write-through, and global. 

• Because memory-forced I/O controller interface accesses address memory space, 
they are subject to the same coherency control as other memory reference 
operations. More generally, accesses to memory-forced I/O controller interface 
segments are considered to be cache-inhibited, write-through and memory-coherent 
operations with respect to the MPC601 cache and bus interface. 
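
The physical address formation for memory-forced I/O controller interface accesses can be sketched with bit masks. PowerPC numbers bits from the most significant end (bit 0 is the MSB of a 32-bit value), so bits 28-31 are the low-order four bits and bits 4-31 are the low-order 28 bits. The function name is hypothetical, and the sketch does not check the T-bit or the BUID field.

```python
def memory_forced_physical_address(segment_register: int, effective_address: int) -> int:
    """Concatenate SR bits 28-31 with EA bits 4-31 (PowerPC bit numbering)."""
    sr_bits_28_31 = segment_register & 0xF           # low-order 4 bits of the SR
    ea_bits_4_31 = effective_address & 0x0FFF_FFFF   # low-order 28 bits of the EA
    return (sr_bits_28_31 << 28) | ea_bits_4_31
```

For instance, if the low four bits of the segment register are x'A' and the effective address is x'30001234', the resulting physical address is x'A0001234'.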

The MPC601 has a single bus interface to support both memory accesses and 
I/O controller interface segment accesses. 

The system recognizes the assertion of the TS signal as the start of a memory access. The 
assertion of XATS indicates an I/O controller interface access. This allows memory devices 
to ignore I/O controller interface transactions. If XATS is asserted, the access is to I/O 
space and the following extensions to the memory access protocol apply: 

• A new set of bus operations are defined. The transfer type, transfer burst, and 
transfer size signals are redefined for I/O controller interface operations; they 
convey the opcode for the I/O transaction (see Table 9-7). 

• There are two beats of address for each I/O controller interface transfer. The first 
beat (packet 0) provides basic address information such as the segment register and 
the sender tag; the second beat (packet 1) provides additional addressing bits and 
several control bits from the segment register. 

• Explicit sender/receiver tags are provided. 

• The sender that initiated the transaction must wait for a reply from the receiver bus- 
unit controller (BUC) before starting a new operation. 

• The MPC601 does not burst I/O controller interface transactions, but streaming is 
permitted. Streaming (in this context) allows multiple single-beat transactions to 
occur before a reply from the I/O receiver is required. 

I/O controller interface transactions use separate arbitration for the split address and data 
buses and define address-only and single-beat transactions. The address-retry vehicle is 
identical, although there is no hardware coherency support for I/O controller interface 
transactions. ARTRY is useful, however, for pacing MPC601 transactions, effectively 
indicating to the MPC601 that the BUC is in a queue-full condition and cannot accept new 
data. 

In addition to the extensions noted above, there are fundamental differences between 
memory and I/O controller interface operations. For example, use of DRTRY is undefined 
for MPC601 I/O controller interface operations. Additionally, only half of the 64-bit data 
path is available for MPC601 I/O controller interface transactions. This lowers the 
pin-count for I/O interfaces but generally results in substantially less bandwidth than memory 
accesses. Additionally, load/store instructions that address I/O controller interface 
segments cannot complete successfully without an error-free reply from the addressed 
BUC. Because normal I/O controller interface accesses involve multiple I/O transactions 
(streaming), they are likely to be very long latency instructions; therefore, I/O controller 
interface operations usually stall MPC601 instruction issue. 

Figure 9-23 shows an I/O controller interface tenure. Note that the I/O response is an 
address-only bus transaction. 

The decision on whether to map I/O peripherals into memory or I/O controller interface 
space depends on many factors; however, it should be noted that in the best case, the use of 
the MPC601 I/O controller interface protocol degrades performance and requires the 
addressed controllers to implement MPC601 bus master capability to generate the reply 
transactions. 

Figure 9-23. I/O Controller Interface Tenures 

9.6.1 I/O Controller Interface Transactions 

Seven I/O controller interface transaction operations are defined by the MPC601, as shown 
in Table 9-7. These operations permit communication between the MPC601 and BUCs. A 
single MPC601 store or load instruction (that translates to an I/O controller interface 
access) generates one or more I/O controller interface operations (two or more I/O 
controller interface operations for loads) from the MPC601 and one reply operation from 
the addressed BUC. 

Table 9-7. I/O Controller Interface Bus Operations 

Operation             Address Only  Direction 
--------------------  ------------  ------------ 
Load start (request)  Yes           MPC601 => IO 
Load immediate        No            MPC601 => IO 
Load last             No            MPC601 => IO 
Store immediate       No            MPC601 => IO 
Store last            No            MPC601 => IO 
Load reply            Yes           IO => MPC601 
Store reply           Yes           IO => MPC601 

For the first beat of the address bus, the extended address transfer code (XATC) contains 
the I/O opcode as shown in Table 9-8; the opcode is formed by concatenating the transfer 
type, transfer burst, and transfer size signals defined as follows: 

XATC = TT(0-3)||TBST||TSIZ(0-2) 

Table 9-8. I/O Controller Interface Bus Operations (XATC Encodings) 

Operation        XATC 
---------------  --------- 
Load request     0100 0000 
Load immediate   0101 0000 
Load last        0111 0000 
Store immediate  0001 0000 
Store last       0011 0000 
Load reply       1100 0000 
Store reply      1000 0000 
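
The XATC concatenation and the encodings in Table 9-8 can be sketched as follows. The names are hypothetical, and the arguments are the logical bit values as they appear in the table (with TT0 as the most significant bit), not the electrical levels seen on the pins.

```python
# Table 9-8 encodings, keyed by the 8-bit XATC value.
XATC_OPERATIONS = {
    0b0100_0000: "Load request",
    0b0101_0000: "Load immediate",
    0b0111_0000: "Load last",
    0b0001_0000: "Store immediate",
    0b0011_0000: "Store last",
    0b1100_0000: "Load reply",
    0b1000_0000: "Store reply",
}


def make_xatc(tt: int, tbst: int, tsiz: int) -> int:
    """Form XATC = TT(0-3) || TBST || TSIZ(0-2) as an 8-bit value."""
    return ((tt & 0xF) << 4) | ((tbst & 0x1) << 3) | (tsiz & 0x7)


def decode_xatc(xatc: int) -> str:
    """Return the Table 9-8 operation name, or "undefined" if not listed."""
    return XATC_OPERATIONS.get(xatc, "undefined")
```

A BUC model could call decode_xatc on the first address beat to decide whether it is seeing a request, a data transfer, or a reply.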



9.6.1.1 Store Operations 

There are three operations defined for I/O controller interface store operations from the 
MPC601 to the BUC, defined as follows: 

1. Store immediate operations transfer up to 32 bits of data 

2. Store last operations transfer up to 32 bits of data each from the MPC601 to the BUC 

3. Store reply from the BUC reveals the success/failure of that I/O controller interface 
access to the MPC601. 

An I/O controller interface store access consists of one or more data transfer operations 
followed by the I/O store reply operation from the BUG. If the data can be transferred in 
one 32-bit data transaction, it is marked as a store last operation followed by the store reply 
operation; no store immediate operation is involved in the transfer, as shown in the 
following sequence: 

STORE LAST (from MPG601) 



STORE REPLY (from BUG) 

However, if more data is involved in the I/O controller interface access, there will be one 
or more store immediate operations. The BUC can detect when the last data is being 
transferred by looking for the store last opcode, as shown in the following sequence: 

STORE IMMEDIATE(s) 
STORE LAST 
STORE REPLY 

9.6.1.2 Load Operations 

I/O controller interface load accesses are similar to store operations, except that the 
MPC601 latches data from the addressed BUC rather than supplying the data to the BUC. 
As with memory accesses, the MPC601 is the master on both load and store operations; the 
external system must provide the data bus grant to the MPC601 when the BUC is ready to 
supply the data to the MPC601. 

The load request I/O controller interface operation has no analogous store operation; it 
informs the addressed BUC of the total number of bytes of data that the BUC must provide 
to the MPC601 on the subsequent load immediate/load last operations. For I/O controller 
interface load accesses, the simplest, 32-bit (or fewer) data transfer sequence is as follows: 

LOAD REQUEST 
LOAD LAST 
LOAD REPLY (from BUC) 

However, if more data is involved in the I/O controller interface access, there will be one 
or more load immediate operations. The BUC can detect when the last data is being 
transferred by looking for the load last opcode, as seen in the following sequence: 

LOAD REQUEST 
LOAD IMM(s) 
LOAD LAST 
LOAD REPLY 
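
The store and load sequences above follow one pattern, which the Python sketch below generates for a given byte count. It is a simplified illustration: it assumes aligned data so that every data operation carries a full four bytes (misaligned operands need additional immediate operations), and the function name is hypothetical.

```python
def io_operation_sequence(kind, nbytes):
    """List the bus operations for one I/O controller interface access.

    kind   -- "load" or "store"
    nbytes -- total bytes to transfer (1 to 128, the MPC601 maximum)

    Assumes aligned data, so each data operation moves up to four bytes.
    """
    if kind not in ("load", "store"):
        raise ValueError("kind must be 'load' or 'store'")
    if not 1 <= nbytes <= 128:
        raise ValueError("nbytes must be between 1 and 128")

    transfers = (nbytes + 3) // 4          # number of 32-bit data operations
    name = kind.upper()
    ops = []
    if kind == "load":
        ops.append("LOAD REQUEST")         # loads announce the byte count first
    ops.extend([name + " IMMEDIATE"] * (transfers - 1))
    ops.append(name + " LAST")             # final data operation
    ops.append(name + " REPLY")            # sent by the BUC
    return ops
```

Calling `io_operation_sequence("store", 4)` gives the two-operation store sequence, and `io_operation_sequence("load", 8)` gives a request, one immediate, a last, and a reply.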

Note that three of the seven defined operations are address-only transactions and do not use 
the data bus. However, unlike the memory transfer protocol, these transactions are not 
broadcast from one master to all snooping devices; the I/O controller interface 
address-only transaction protocol strictly controls communication between the MPC601 
and the BUC. 

9.6.2 I/O Controller Interface Transaction Protocol Details 

As mentioned previously, there are two address-bus beats corresponding to two packets of 
information about the address. The two packets contain the sender and receiver tags, the 
address and extended address bits, and extra control and status bits. The two beats of the 
address bus (plus attributes) are shown at the top of Figure 9-24 as two packets. The first 
packet, packet 0, is then expanded to depict the XATC and address bus information in 
detail. 

9.6.2.1 Packet 0 

Figure 9-24 shows the organization of the first packet in an I/O controller interface 
transaction. 

The XATC contains the I/O opcode, as discussed earlier and as shown in Table 9-8. The 
address bus contains the following: 

Key bit || segment register || sender tag 

Figure 9-24. I/O Controller Interface Operation: Packet 0 

This information is organized as follows: 

• Bits 0 and 1 of the address bus are reserved; the MPC601 always drives these bits 
to zero. 

• Key bit: Bit 2 is the key bit from the segment register (either SR[Ku] or SR[Ks]). 
Ku indicates user-level access and Ks indicates supervisor-level access. The MPC601 
multiplexes the correct key bit into this position according to the current operating 
context (user or supervisor). 

• Segment register: Address bits 3-27 correspond to bits 3-27 of the segment 
register. Note that address bits 3-11 form the nine-bit receiver tag. Software must 
initialize these bits in the segment register to the ID of the BUC to be addressed; they 
are referred to as the BUID (bus unit ID) bits. 

• PID (sender tag): Address bits 28-31 form the four-bit sender tag. These bits come 
from bits 28-31 of the MPC601 PID (processor ID) register. A four-bit tag allows a 
maximum of 16 processor IDs to be defined for a given system. If more bits are 
needed, for example for a very large multiprocessor system, it is envisioned that the 
second-level cache (or equivalent logic) can append a larger processor tag as 
needed. The BUC addressed by the receiver tag should latch the sender address 
required by the subsequent I/O reply operation. 
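
The packet 0 address layout can be sketched in Python as follows. Bit 0 is the most-significant bit of the 32-bit address, as elsewhere in this manual. The function names are hypothetical, and the caller is assumed to have already selected the correct key bit (Ku or Ks), since the position of those bits within the segment register is not given in this section.

```python
def pack_packet0_address(key_bit, segment_register, pid):
    """Assemble the 32-bit packet 0 address (bit 0 is the MSB).

    bits 0-1   reserved, driven to zero
    bit  2     key bit chosen by the caller (Ku or Ks)
    bits 3-27  copied from segment register bits 3-27
    bits 28-31 copied from PID register bits 28-31 (sender tag)
    """
    addr = (key_bit & 0x1) << 29
    addr |= segment_register & 0x1FFFFFF0
    addr |= pid & 0xF
    return addr


def receiver_tag(addr):
    """Nine-bit BUID held in address bits 3-11."""
    return (addr >> 20) & 0x1FF


def sender_tag(addr):
    """Four-bit processor ID held in address bits 28-31."""
    return addr & 0xF
```

A BUC would compare `receiver_tag(addr)` against its own ID and latch `sender_tag(addr)` for use in the later reply operation.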

9.6.2.2 Packet 1 

The second address beat, packet 1, transfers byte counts and the physical address for the 
transaction, as shown in Figure 9-25. 

Figure 9-25. I/O Controller Interface Operation: Packet 1 

For packet 1, the XATC is defined as follows: 

• Load request operations: XATC contains the total number of bytes to be transferred 
(128 bytes maximum for MPC601). 

• Immediate/last (load or store) operations: XATC contains the current transfer byte 
count (one to four bytes). 

Address bits 0-31 contain the physical address of the transaction. The physical address is 
generated by concatenating segment register bits 28-31 with bits 4-31 of the effective 
address, as follows: 

Segment register (bits 28-31) || effective address (bits 4-31) 

While the MPC601 provides the address of the transaction to the BUC, the BUC must 
maintain a valid address pointer for the reply. 
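
The packet 1 address concatenation can be written out as a small Python sketch. The function name is hypothetical; bit 0 is taken as the most-significant bit, so segment register bits 28-31 are the low-order four bits and effective address bits 4-31 are the low-order 28 bits.

```python
def io_physical_address(segment_register, effective_address):
    """Segment register bits 28-31 || effective address bits 4-31."""
    high = (segment_register & 0xF) << 28       # SR bits 28-31 become bits 0-3
    low = effective_address & 0x0FFFFFFF        # EA bits 4-31 become bits 4-31
    return high | low
```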

9.6.3 I/O Reply Operations 

BUCs must respond to MPC601 I/O controller interface transactions with an I/O reply 
operation. The purpose of this reply operation is to inform the MPC601 of the success or 
failure of the attempted I/O controller interface access. This requires the system I/O 
controller interface to have MPC601 bus mastership capability — a substantially more 
complex design task than bus slave implementations that use memory-mapped I/O access. 

Reply operations from the BUC to the MPC601 are address-only transactions. As with 
packet 0 of the address bus on MPC601 I/O controller interface operations, the XATC 
contains the opcode for the operation (see Table 9-8). Additionally, the I/O reply operation 
transfers the sender/receiver tags in the first beat. 

Figure 9-26. I/O Reply Operation 

The address bits are described in Table 9-9. 

Table 9-9. Address Bits for I/O Reply Operations 

Address Bits    Description 

0-1             Reserved. These bits should be set to zero for compatibility with future 
                PowerPC microprocessors. 

2               Error bit. It is set if the BUC records an error in the access. 

3-11            BUID. Sender tag of a reply operation. Corresponds with bits 3-11 of one 
                of the MPC601 segment registers. 

12-27           Address bits 12-27 are BUC-specific and are ignored by the MPC601. 

28-31           PID (receiver tag). The MPC601 effectively snoops operations on the bus 
                and, on reply operations, compares this field to bits 28-31 of the PID 
                register to determine if it should recognize this I/O reply. 
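
The fields of Table 9-9 can be picked apart with a few shifts and masks, as in the Python sketch below. The function name and the returned dictionary keys are hypothetical; the comparison against the processor's own PID mirrors the snooping behavior described in the table.

```python
def decode_io_reply(addr, my_pid):
    """Split the first address beat of an I/O reply (bit 0 is the MSB)."""
    pid = addr & 0xF                          # bits 28-31, receiver tag
    return {
        "error": bool((addr >> 29) & 0x1),    # bit 2, error bit
        "buid": (addr >> 20) & 0x1FF,         # bits 3-11, sender tag
        "pid": pid,
        "for_me": pid == (my_pid & 0xF),      # reply addressed to this processor?
    }
```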



The second beat of the address bus is reserved; the XATC and address buses should be 
driven to zero to preserve compatibility with future protocol enhancements. 

The following sequence occurs when the MPC601 detects an error bit set on an I/O reply 
operation: 

1. The MPC601 completes the instruction that initiated the access. 

2. If the instruction is a load, the data is forwarded onto the register file(s)/sequencer. 

3. An I/O controller interface error exception is generated, which transfers MPC601 
control to the I/O controller interface error exception handler to recover from the 
error. Refer to Section 5.4.10, "I/O Controller Interface Error Exception 
(x'00A00')," for more information. 

If the error bit is not set, the MPC601 instruction that initiated the access completes and 
instruction execution resumes. 

System designers should note the following: 

• "Misplaced" reply operations (that match the processor tag and arrive unexpectedly) 
cause a checkstop condition. Refer to Chapter 5, "Exceptions," for more 
information. 

• External logic must assert AACK for the MPC601, even though it is the receiver of 
the reply operation. AACK is an input-only signal to the MPC601. 

• The MPC601 monitors address parity when enabled by software and XATS and 
reply operations (load or store). 

9.6.4 I/O Controller Interface Operation Timing 

The following timing diagrams show the sequence of events in a typical MPC601 I/O 
controller interface load access (Figure 9-27) and a typical MPC601 I/O controller 
interface store access (Figure 9-28). All arbitration signals except for ABB and DBB have 
been omitted for clarity. Note that for either case, the number of immediate operations 
depends on the amount and the alignment of data to be transferred. If no more than four 
bytes are being transferred, and the data is double-word aligned (that is, does not straddle 
an eight-byte address boundary), there will be no immediate operation as shown in the 
figures. 

The MPC601 can transfer as many as 128 bytes of data in one load or store instruction 
(requiring more than 33 immediate operations in the case of misaligned operands). 

In Figure 9-27, XATS is asserted with the same timing relationship as TS in a memory 
access. Notice, however, that the address bus (and XATC) transition on the next bus clock 
cycle. The first of the two beats on the address bus is valid for one bus clock cycle window 
only, and that window is defined by the assertion of XATS. The second address bus beat, 
however, can be extended by delaying the assertion of AACK until the system has latched 
the address. 

The load request and load reply operations shown in Figure 9-27 are address-only 
transactions, as denoted by the negated TT3 signal during their respective address tenures. 
Note that other types of bus operations can occur between the individual I/O controller 
interface operations on the bus. The MPC601 involved in this transaction, however, does 
not initiate any other transactions once the first I/O controller interface operation has begun 
address tenure except for cache-sector snoop push-out operations resulting from snoop hits. 

Notice that, in this example (zero wait states), 13 bus clock cycles are required to transfer 
no more than eight bytes of data. 

Figure 9-27. I/O Controller Interface Load Access Example 

Figure 9-28 shows an I/O store access, comprised of three I/O controller interface 
operations in this example. As with the example in Figure 9-27, notice that data is 
transferred only on the 32 bits of the DH bus. As opposed to Figure 9-27, there is no request 
operation since the MPC601 has the data ready for the BUC. 

The TEA signal may be asserted on any I/O controller interface operation. If it is asserted, 
the processor enters a checkstop condition if MSR[ME] is cleared, or it queues a 
machine check exception if MSR[ME] is set. After TEA is asserted, it must be reasserted 
for all tenures associated with the current I/O controller interface operation until the load 
last or store last operation occurs. When that operation occurs, the execution unit is released 
to take the machine check exception. If the TEA signal is asserted for an I/O controller 
interface operation, the reply operation (store reply or load reply) must not occur; if it does, 
it causes a checkstop condition. If the TEA signal is not asserted with a given I/O controller 
interface operation, the results of the assertion of TEA are unpredictable. The MPC601 may 
take a machine check exception or cause a checkstop condition. 

Figure 9-28. I/O Controller Interface Store Access Example 

9.7 Interrupt, Checkstop, and Reset Signals 

This section describes external interrupts, checkstop operations, and hard and soft reset 
inputs. 

9.7.1 External Interrupt 

The maskable interrupt input (INT) to the MPC601 eventually forces the processor to take 
the external interrupt vector if the MSR[EE] bit is set. See Chapter 5, "Exceptions," for 
more information about interrupts and exceptions. 

9.7.2 Checkstops 

The MPC601 has two checkstop signals, an input (CKSTP_IN) and an output 
(CKSTP_OUT). If CKSTP_IN is asserted, the MPC601 halts operations by gating off all 
internal clocks. The MPC601 does not assert CKSTP_OUT if CKSTP_IN is asserted. 

If CKSTP_OUT is asserted, the MPC601 has checkstopped internally. The CKSTP_OUT 
signal can be asserted for various reasons including receiving a TEA signal, as the result of 
an instruction dispatch, or internal and external parity errors. For more information on 
checkstop state, refer to Section 5.4.2.2, "Checkstop State (MSR[ME] = 0)." 

Note that checkstop conditions can be disabled by setting bits in the HID0 register. For 
information, see Section 2.3.3.12.1, "Checkstop Sources and Enables Register - HID0." 

9.7.3 Reset Inputs 

The MPC601 has two reset inputs, described as follows: 

• HRESET (hard reset): The HRESET signal is used for power-on reset sequences, or 
for situations in which the MPC601 must go through the entire cold-start sequence 
of self-tests. Once asserted, this input must be held asserted for a minimum of 300 
processor clock cycles to ensure that the processor has had enough time to recognize 
the input and initialize registers. This information is provided in Appendix D, 
"Reset." 

• SRESET (soft reset): The soft reset input provides warm reset capability. This 
input can be used to avoid forcing the MPC601 to complete the cold-start sequence. 
This can be useful to recover from such conditions as checkstop or some 
machine-check states that cannot be restarted. 

When either reset input is negated and if the self-test sequence completes without error, the 
processor attempts to fetch code from the system reset exception vector. The vector is 
located at offset x'00100' from the exception prefix (all zeros or ones, depending on the 
setting of the exception prefix bit in the machine state register, MSR[EP]). The EP bit is set 
for HRESET. 
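
The relationship between the exception prefix and the vector offset can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch assumes that the prefix occupies the 12 high-order address bits above the five-hexadecimal-digit offset, which is inferred from the offset format rather than stated in this section.

```python
def exception_vector(offset, ep):
    """Return the 32-bit fetch address for an exception vector.

    offset -- vector offset such as 0x00100 (system reset)
    ep     -- value of MSR[EP]; selects a prefix of all ones or all zeros

    Assumes a 12-bit prefix above a 20-bit offset.
    """
    prefix = 0xFFF00000 if ep else 0x00000000
    return prefix | (offset & 0x000FFFFF)
```

Under these assumptions, a hard reset (EP set) fetches from `exception_vector(0x00100, True)`.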

9.7.4 Soft Stop Control Signals 

The soft stop control signals allow the processor to stop the clocks and bring the activity to 
a quiescent state in an orderly fashion (as opposed to a hard stop, which simply halts the 
clocks without regard to system activity). 

The soft stop state is entered by asserting the QUIESC_REQ signal. This signal allows the 
system to complete any bus activities that might be affected by stopping the clocks. When 
the system is ready to enter the soft stop state, it asserts the SYS_QUIESC signal. At this 
time the MPC601 takes a soft stop. 

During a soft stop all internal clocking is disabled after the system activity quiesces in an 
orderly manner, that is, there are no partially finished instructions. Soft stop is typically 
used for debugging; during the soft stop, the state bits in the chip can be scanned, examined 
and scanned back in. The processor returns to normal operation when the RESUME signal 
is asserted. 

9.8 Processor State Signals 

This section describes the MPC601's support for atomic update and storage through the use 
of the lwarx/stwcx. opcode pair and the configuration options for the MPC601 output 
buffer. 

9.8.1 Support for the lwarx/stwcx. Instruction Pair 

The Load Word and Reserve Indexed (lwarx) and the Store Word Conditional Indexed 
(stwcx.) instructions provide a means for atomic memory updating. Memory can be updated 
atomically by setting a reservation on the load and checking that the reservation is still valid 
before the store is performed. In the MPC601, the reservations are made on behalf of 
aligned, 32-byte sections of the memory address space. 
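
The reservation mechanism can be modeled with a toy Python class. This is a behavioral sketch only, not a description of the MPC601 hardware: the class and method names are hypothetical, and it assumes that a store by another bus master to the reserved 32-byte section cancels the reservation and that stwcx. checks only whether a reservation is still held.

```python
GRANULE = 32  # reservation granularity in bytes, from the text above


def granule(addr):
    """Base address of the aligned 32-byte section containing addr."""
    return addr & ~(GRANULE - 1)


class Reservation:
    """Toy model of a single processor's reservation."""

    def __init__(self):
        self.addr = None                     # no reservation held

    def lwarx(self, addr):
        self.addr = granule(addr)            # set reservation on the section

    def snoop_store(self, addr):
        # Assumption: another master storing to the section cancels it.
        if self.addr is not None and granule(addr) == self.addr:
            self.addr = None

    def stwcx(self):
        ok = self.addr is not None           # store performed only if still valid
        self.addr = None                     # reservation is consumed either way
        return ok
```

In this model, a sequence of `lwarx(0x1004)` followed by `snoop_store(0x101C)` makes the subsequent `stwcx()` fail, because both addresses fall in the section that starts at 0x1000.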

The reservation (RSRV) output signal is always driven by bus clock cycle and reflects the 
status of the reservation coherency bit in the reservation address register (see Chapter 4, 
"Cache and Memory Unit Operation," for more information). See Section 8.2.9.8, 
"Reservation (RSRV) — Output," for information about timing. 

9.9 IEEE 1149.1-Compatible Interface 

The MPC601 provides a boundary-scan interface compatible with IEEE 1149.1-compliant 
parts. Although the standard allows built-in self-test (BIST), the MPC601 interface 
supports only boundary scan. This section briefly describes the MPC601 interface and its 
differences from the IEEE 1149.1 interface. 

9.9.1 Deviations from the IEEE 1149.1 Boundary-Scan Specifications 

The MPC601 deviates from the IEEE 1149.1 specifications in the following ways: 

• In the IEEE 1149.1 specifications, no mode pin is required to use the IEEE 1149.1 
boundary-scan interface. However, in the MPC601, the scan enable mode input 
(BSCAN_EN) signal must be asserted to run boundary-scan testing. The signal must 
be pulled up when boundary-scan testing is not being performed. 

• Whereas the IEEE 1149.1 specifications indicate that only TCK should be used to 
clock data-register latches, in the MPC601 the processor system clock must be 
active (oscillating) during testing. 

• The MPC601 implements only the PRELOAD portion of the SAMPLE/PRELOAD 
function. 

• IEEE 1149.1 specifies that data on the primary output should be held valid while the 
processor is in the SHIFT DR state and that data should change only in the UPDATE 
DR or UPDATE IR states (assuming the instruction is valid). In the MPC601, no 
stable values are held on primary outputs for the SHIFT DR state. The SHIFT DR 
state forces primary outputs to high impedance. Outputs are enabled if the IR 
contains a valid instruction and the TAP is in the UPDATE DR or UPDATE IR state. 

• IEEE 1149.1 specifies that asserting the TRST signal should reset only the TAP. In 
the MPC601, the TRST signal resets the TAP, system logic, and the COP. 

  IEEE 1149.1 also specifies the use of the TRST signal to disable the TAP. On the 
  MPC601, this can be done by negating the BSCAN_EN signal, which prohibits 
  resetting the TAP and system logic independently. The TRST signal should not be 
  used to disable the TAP in the system functional environment; the BSCAN_EN 
  signal should be used. The user can use the TRST signal as described above or hold 
  TMS high for five TCK cycles. Note that not all SRLs in the MPC601 are 
  boundary-scan SRLs. The boundary-scan chain includes functional system SRLs. 

9.9.2 Additional Information about the IEEE 1149.1 Interface 

Note the following points concerning the IEEE 1149.1 interface: 

• Because the driver inhibit to all COMMON I/O signals is controlled by a common 
signal, all COMMON INPUT/OUTPUT devices must inbound off-chip data or 
outbound on-chip data. 

• Not all SRLs in the boundary-scan chain are boundary-scan SRLs. 



9.10 Using DBWO (Data Bus Write Only) 

The MPC601 supports split-transaction pipelined transactions. Additionally, the DBWO 
signal allows the MPC601 to be configured dynamically to source write data out of order 
with respect to read data. 

In general, an address tenure on the bus is followed strictly in order by its associated data 
tenure. Transactions pipelined by a single MPC601 complete strictly in order. However, the 
MPC601 can run bus transactions out of order only when the external system allows the 
MPC601 to perform a cache-sector snoop push-out operation (or other write transaction, if 
pending in the MPC601 write queues) between the address and data tenures of a read 
operation through the use of DBWO. This effectively envelopes the write operation within 
the read operation. This can be useful in some external queued controller scenarios or for 
more complex memory implementations that can support so-called dump-and-run 
operations. These include the cache sector cast out of a modified sector caused by a load 
miss. A replacement copyback operation can be written to memory buffers while the 
memory location is being accessed for the line fill. The sector is written (dumped) into 
memory buffers while the memory is accessed for the load operation. Optimally, the 
replacement copy-back operation can be absorbed by the memory system without affecting 
load memory latency. Figure 9-11 gives an example of the use of the DBWO input. 

Figure 9-29 illustrates the following sequence of operations: 

1. Processor A begins a read operation. (Bus clock cycle 2) 

2. Processor B attempts a global read but is interrupted by a retry from processor A 
(bus clock cycle 7) 

3. Processor A performs a cache-sector snoop push-out operation out of order because 
of the assertion of DBWO (bus clock cycle 8) 

4. Processor B successfully performs the global read (bus clock cycle 13) 

5. Processor A successfully concludes its original read operation (bus clock cycle 16) 

Figure 9-29. Data Bus Write-Only Transaction 

Note that although the MPC601 can pipeline any write transaction behind the read 
transaction, special care should be used when using the enveloped write feature. It is 
envisioned that most system implementations will not need this capability; for these 
applications DBWO should remain negated. In systems where this capability is needed, 
DBWO should be asserted under the following scenario: 

1. The MPC601 initiates a read transaction (either single-beat or burst) by completing 
the read address tenure with no address retry. 

2. Then, the MPC601 initiates a write transaction by completing the write address 
tenure, with no address retry. 

3. At this point, if DBWO is asserted with a qualified data bus grant to the MPC601, 
the MPC601 asserts DBB and drives the write data onto the data bus, out of order 
with respect to the address pipeline. The write transaction concludes with the 
MPC601 negating DBB. 

4. The next qualified data bus grant signals the MPC601 to complete the outstanding 
read transaction by latching the data on the bus. This assertion of DBG should not 
be accompanied by an asserted DBWO. 

Any number of bus transactions by other bus masters can be attempted between any of 
these steps. 

Note the following regarding DBWO: 

• DBWO cannot be asserted if no data bus write tenures are pending. 

• DBWO can be asserted if no data bus read is pending, but it has no effect on write 
ordering. 

• The ordering and presence of data bus writes is determined by the writes in the write 
queues at the time BG is asserted for the write address (not DBG). If a particular 
write is desired (for example, a cache-sector snoop push-out operation), then BG 
must be asserted after that particular write is in the queue and it must be the highest 
priority write in the queue at that time. A cache-sector snoop push-out operation 
may be the highest priority write, but more than one may be queued. 

• Because more than one write may be in the write queue when DBG is asserted for 
the write address, more than one data bus write may be enveloped by a pending data 
bus read. 
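
The ordering rules above can be captured in a simplified Python model of the data tenures pending for one MPC601. It is a sketch under stated assumptions: the function name and the queue representation are hypothetical, the queue holds only the strings "read" and "write" in address-tenure order, and an assertion of DBWO with no write pending is reported as an error because the notes above disallow it.

```python
from collections import deque


def next_data_tenure(pending, dbwo):
    """Pick the data tenure served by the next qualified data bus grant.

    pending -- deque of "read"/"write" entries in address-tenure order
    dbwo    -- True if DBWO is asserted with this grant
    """
    if not pending:
        raise ValueError("no data tenure is pending")
    if dbwo:
        if "write" not in pending:
            raise ValueError("DBWO asserted with no write tenure pending")
        pending.remove("write")      # oldest queued write runs ahead of any read
        return "write"
    return pending.popleft()         # otherwise strictly in order
```

With `pending = deque(["read", "write"])`, a grant with DBWO asserted serves the write first, and the following grant without DBWO completes the read, matching steps 3 and 4 of the scenario.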

The arbiter must monitor bus operations and coordinate the various masters and slaves with 
respect to the use of the data bus when DBWO is used. Individual DBG signals associated 
with each bus device should allow the arbiter to synchronize both pipelined and 
split-transaction bus organizations. Individual DBG signals provide a primitive form of 
source-level tagging for the granting of the data bus. 

Note that use of the DBWO signal allows some operation-level tagging with respect to the 
MPC601 and the use of the data bus. 

Chapter 10 
Instruction Set 

This chapter describes individual instructions, including a description of instruction 
formats and notation and an alphabetical listing of the MPC601's instructions by 
mnemonic. 

10.1 Instruction Formats 

Instructions are four bytes long and word-aligned, so when instruction addresses are 
presented to the processor (as in branch instructions) the two low-order bits are ignored. 
Similarly, whenever the processor develops an instruction address, its two low-order bits 
are zero. 

Bits 0-5 always specify the primary opcode. Many instructions also have a secondary 
opcode. The remaining bits of the instruction contain one or more fields for the different 
instruction formats. 
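
Because bit 0 is the most-significant bit of the instruction word, extracting a field takes a shift from the right-hand end. The Python sketch below shows one way to do this; the helper names are hypothetical, and the sample word 0x3864FFFF (which encodes addi r3,r4,-1) is used only as an illustration.

```python
def field(instr, first, last):
    """Extract bits first..last (inclusive, bit 0 = MSB) of a 32-bit word."""
    width = last - first + 1
    return (instr >> (31 - last)) & ((1 << width) - 1)


def primary_opcode(instr):
    """Bits 0-5 always hold the primary opcode."""
    return field(instr, 0, 5)


def instruction_address(addr):
    """Instruction addresses ignore the two low-order bits."""
    return addr & ~0x3


sample = 0x3864FFFF
print(primary_opcode(sample), field(sample, 6, 10), field(sample, 11, 15))
```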

Some instruction fields are reserved or must contain a predefined value as shown in the 
individual instruction layouts. If a reserved field does not have all bits set to 0, or if a field 
that must contain a particular value does not contain that value, the instruction form is 
invalid and the results are as described in Appendix D, "Classes of Instructions". 

10.1.1 Split Field Notation 

Some instruction fields occupy more than one contiguous sequence of bits or occupy a 
contiguous sequence of bits used in permuted order. Such a field is called a split field. In 
the format diagrams and in the individual instruction layouts, the name of a split field is 
shown in small letters, once for each of the contiguous sequences. In the pseudocode 
description of an instruction having a split field and in some places where individual bits of 
a split field are identified, the name of the field in small letters represents the concatenation 
of the sequences from left to right. Otherwise, the name of the field is capitalized and 
represents the concatenation of the sequences in some order, which need not be left to right, 
as described for each affected instruction. 

10.1.2 Instruction Fields 

Table 10-1 describes the instruction fields used in the various instruction formats. 

Table 10-1. Instruction Formats 

Field               Description 

AA (30)             Absolute address bit 
                    0  The immediate field represents an address relative to the current 
                       instruction address. The effective address of the branch is either 
                       the sum of the LI field sign-extended to 32 bits and the address of 
                       the branch instruction or the sum of the BD field sign-extended to 
                       32 bits and the address of the branch instruction. 
                    1  The immediate field represents an absolute address. The effective 
                       address of the branch is the LI field sign-extended to 32 bits or 
                       the BD field sign-extended to 32 bits. 

crbA (11-15)        Field used to specify a bit in the CR to be used as a source. 

crbB (16-20)        Field used to specify a bit in the CR to be used as a source. 

BD (16-29)          Immediate field specifying a 14-bit signed two's complement branch 
                    displacement that is concatenated on the right with b'00' and 
                    sign-extended to 32 bits. 

crfD (6-8)          Field used to specify one of the CR fields or one of the FPSCR fields 
                    as a destination. 

crfS (11-13)        Field used to specify one of the CR fields or one of the FPSCR fields 
                    as a source. 

BI (11-15)          Field used to specify a bit in the CR to be used as the condition of a 
                    branch conditional instruction. 

BO (6-10)           Field used to specify options for the branch conditional instructions. 
                    The encoding is described in Section 3.7.1, "Branch Instructions". 

crbD (6-10)         Field used to specify a bit in the CR or in the FPSCR as the 
                    destination of the result of an instruction. 

d (16-31)           Immediate field specifying a 16-bit signed two's complement integer 
                    that is sign-extended to 32 bits. 

FM (7-14)           Field mask used to identify the FPSCR fields that are to be updated by 
                    the mtfsf instruction. 

frA (11-15)         Field used to specify an FPR as a source of an operation. 

frB (16-20)         Field used to specify an FPR as a source of an operation. 

frC (21-25)         Field used to specify an FPR as a source of an operation. 

frS (6-10)          Field used to specify an FPR as a source of an operation. 

frD (6-10)          Field used to specify an FPR as the destination of an operation. 

CRM (12-19)         Field mask used to identify the CR fields that are to be updated by 
                    the mtcrf instruction. 

LI (6-29)           Immediate field specifying a 24-bit, signed two's complement integer 
                    that is concatenated on the right with b'00' and sign-extended to 
                    32 bits. 

LK (31)             Link bit. 
                    0  Does not update the link register. 
                    1  Updates the link register. If the instruction is a branch 
                       instruction, the address of the instruction following the branch 
                       instruction is placed into the link register. 

MB (21-25) and      Fields used in rotate instructions to specify a 32-bit mask consisting 
ME (26-30)          of 1-bits from bit MB+32 through bit ME+32 inclusive, and 0-bits 
                    elsewhere, as described in Section 3.3.4, "Integer Rotate and Shift 
                    Instructions". 

NB (16-20)          Field used to specify the number of bytes to move in an immediate 
                    string load or store. 

opcode (0-5)        Primary opcode field. 

OE (21)             Used for extended arithmetic to enable setting OV and SO in the XER. 

rA (11-15)          Field used to specify a GPR to be used as a source or as a destination. 

rB (16-20)          Field used to specify a GPR to be used as a source. 

Rc (31)             Record bit 
                    0  Does not update the condition register. 
                    1  Updates the condition register (CR) to reflect the result of the 
                       operation. 
                       For integer instructions, CR bits 0-3 are set to reflect the result 
                       as a signed quantity. The result as an unsigned quantity or a bit 
                       string can be deduced from the EQ bit. For floating-point 
                       instructions, CR bits 4-7 are set to reflect floating-point 
                       exception, floating-point enabled exception, floating-point invalid 
                       operation exception, and floating-point overflow exception. 

rS (6-10)           Field used to specify a GPR to be used as a source. 

rD (6-10)           Field used to specify a GPR to be used as a destination. 

SH (16-20)          Field used to specify a shift amount. 

SIMM (16-31)        Immediate field used to specify a 16-bit signed integer. 

SPR (11-20)         Field used to specify a special purpose register for the mtspr and 
                    mfspr instructions. The encoding is described in Section 3.8.1, "Move 
                    To/From Special Purpose Register Instructions". 

TO (6-10)           Field used to specify the conditions on which to trap. The encoding is 
                    described in Section 3.7.5, "Trap Mnemonics". 

IMM (16-19)         Immediate field used as the data to be placed into a field in the 
                    FPSCR. 

UIMM (16-31)        Immediate field used to specify a 16-bit unsigned integer. 

XO (21-30, 22-30,   Secondary opcode field. 
26-30, or 30)       
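
Two of the field descriptions in Table 10-1 involve a small calculation: the branch target formed from LI and AA, and the mask selected by MB and ME. The Python sketch below works through both. All function names are hypothetical, only the LI form of branch is handled, and the wrap-around behavior of the mask when MB is greater than ME is assumed to follow the MASK(x, y) convention of Table 10-2, with bits numbered 0-31.

```python
def field(instr, first, last):
    """Extract bits first..last (inclusive, bit 0 = MSB) of a 32-bit word."""
    return (instr >> (31 - last)) & ((1 << (last - first + 1)) - 1)


def exts(value, bits):
    """Sign-extend a 'bits'-wide two's complement value to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(instr, cia):
    """Effective address of a branch that uses the LI field."""
    displacement = exts(field(instr, 6, 29) << 2, 26)   # LI || b'00'
    absolute = field(instr, 30, 30)                     # AA bit
    target = displacement if absolute else cia + displacement
    return target & 0xFFFFFFFF


def rotate_mask(mb, me):
    """32-bit mask with 1-bits from bit mb through bit me, wrapping if mb > me."""
    from_mb = 0xFFFFFFFF >> mb                          # bits mb..31
    to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF      # bits 0..me
    return (from_mb & to_me) if mb <= me else (from_mb | to_me)
```

For instance, `branch_target(0x4BFFFFFC, 0x1000)` applies a displacement of minus four to the current instruction address, and `rotate_mask(24, 31)` selects the low-order byte.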



10.1.3 Notation and Conventions 

The operation of some instructions is described by a semiformal language (pseudocode). 
See Table 10-2 for a list of pseudocode notation and conventions used throughout this 
chapter. 

Table 10-2. Pseudocode Notation and Conventions 

Notation/Convention   Meaning 

←                     Assignment 

¬                     NOT logical operator 

*                     Multiplication 

÷                     Division (yielding quotient) 

+                     Two's-complement addition 

-                     Two's-complement subtraction, unary minus 

=, ≠                  Equals and Not Equals relations 

<, ≤, >, ≥            Signed comparison relations 

<U, >U                Unsigned comparison relations 

?                     Unordered comparison relation 

&, |                  AND, OR logical operators 

||                    Used to describe the concatenation of two values (i.e., 010 || 111 is 
                      the same as 010111) 

⊕, ≡                  Exclusive-OR, Equivalence logical operators ((a ≡ b) = (a ⊕ ¬b)) 

b'nnnn'               A number expressed in binary format 

x'nnnn'               A number expressed in hexadecimal format 

(rA|0)                The contents of rA if the rA field has the value 1-31, or the value 0 
                      if the rA field is 0 

. (period)            As the last character of an instruction mnemonic, a period (.) means 
                      that the instruction updates the Condition Register field. 

CEIL(x)               Least integer ≥ x 

DOUBLE(x)             Result of converting x from floating-point single format to 
                      floating-point double format. 

EXTS(x)               Result of extending x on the left with sign bits 

GPR(x)                General Purpose Register x 

MASK(x, y)            Mask having 1's in positions x through y (wrapping if x > y) and 0's 
                      elsewhere 

MEM(x, y)             Contents of y bytes of memory starting at address x 

ROTL[32](x, y)        Result of rotating the 64-bit value x||x left y positions, where x is 
                      32 bits long 

SINGLE(x)             Result of converting x from floating-point double format to 
                      floating-point single format. 

SPR(x)                Special Purpose Register x 

xⁿ                    x is raised to the nth power 

ⁿx                    The replication of x, n times (i.e., x concatenated to itself n-1 
                      times). ⁿ0 and ⁿ1 are special cases 

x[n]                  n is a bit or field within x, where x is a register 

TRAP                  Invoke the system trap handler 

undefined             An undefined value. The value may vary from one implementation to 
                      another, and from one execution to another on the same 
                      implementation. 

characterization      Reference to the setting of status bits, in a standard way that is 
                      explained in the text 

CIA                   Current Instruction Address, which is the 32-bit address of the 
                      instruction being described by a sequence of pseudocode. Used by 
                      relative branches to set the Next Instruction Address (NIA). Does not 
                      correspond to any architected register. 
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Table 10-2. Pseudocode Notation and Conventions (Continued) 



Notation/Convention 


Meaning 


NIA 


Next Instruction Address, which is the 32-bit address of the next instruction to be 
executed (the branch destination) after a successful branch. In pseudocode, a 
successful branch is indicated by assigning a value to NIA. For instructions which do not 
branch, the next instruction address is CIA +4. 


if.. .then. ..else... 


Conditional execution, indenting shows range, else is optional 


do 


Do loop, indenting shows range. "To" and/or "by" clauses specify incrementing an 
iteration variable, and "while' and/or "until" clauses give termination conditions, in the 
usual manner. 


leave 


Leave innermost do loop, or do loop described in leave statement 
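
Three of the helper functions in Table 10-2 can be sketched in Python as follows. This is a minimal illustration rather than part of the manual; it assumes 32-bit registers and the bit numbering used throughout this chapter (bit 0 is the most significant bit). The function names and the explicit width argument of exts are choices made for this sketch.

```python
MASK32 = 0xFFFFFFFF


def exts(value, bits):
    """EXTS(x): sign-extend a field that is `bits` wide to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def rotl32(x, y):
    """ROTL[32](x, y): rotate the 32-bit value x left by y positions."""
    y %= 32
    x &= MASK32
    return ((x << y) | (x >> (32 - y))) & MASK32


def mask(x, y):
    """MASK(x, y): 1's in bit positions x through y (bit 0 = MSB), wrapping if x > y."""
    ones_from_x = MASK32 >> x                    # bits x..31 set
    ones_to_y = (MASK32 << (31 - y)) & MASK32    # bits 0..y set
    if x <= y:
        return ones_from_x & ones_to_y
    return ones_from_x | ones_to_y
```

For instance, mask(28, 3) wraps around the word boundary and selects the four low-order and four high-order bits.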



Precedence rules for pseudocode operators are summarized in Table 10-3. 

Table 10-3. Precedence Rules 



Operators                                        Associativity

x[n], function evaluation                        Left to right
(n)x or replication, x(n) or exponentiation      Right to left
unary -, ¬                                       Right to left
*, ÷                                             Left to right
+, -                                             Left to right
||                                               Left to right
=, ≠, <, ≤, >, ≥, <U, >U, ?                      Left to right
&, ⊕, ≡                                          Left to right
|                                                Left to right
- (range)                                        None
<-                                               None



Note that operators higher in Table 10-3 are applied before those lower in the table. 
Operators at the same level in the table associate from left to right, from right to left, or not 
at all, as shown. 

10.2 MPC601 Instruction Set 

The remainder of this chapter lists and describes the instruction set for the MPC601. The 
instructions are listed in alphabetical order by mnemonic and include those instructions that 
are specific to the MPC601 that are not specified as part of the PowerPC architecture. 
Figure 10-1 shows the format for each instruction description page. 
[Figure: an annotated sample instruction page for addx. The callouts identify, from top to bottom, the Instruction Name, Instruction Syntax, Equivalent POWER Mnemonics, Instruction Encoding, Pseudocode Description of Instruction Operation, Text Description of Instruction Operation, and Registers Altered by Instruction. The page also names the execution unit (Integer Unit).]

Figure 10-1. Instruction Description 

Note in Figure 10-1 that the execution unit that executes the instruction may not be the 
same for other PowerPC processors. 
absx 
Absolute (POWER Architecture) 
Integer Unit 

abs     rD,rA    (OE=0 Rc=0) 
abs.    rD,rA    (OE=0 Rc=1) 
abso    rD,rA    (OE=1 Rc=0) 
abso.   rD,rA    (OE=1 Rc=1) 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | 00000 (16-20, reserved) | OE (21) | 360 (22-30) | Rc (31) 

This instruction is not part of the PowerPC architecture. 

The absolute value |(rA)| is placed into rD. If rA contains the most negative number (i.e., 
x'8000 0000'), the result of the instruction is the most negative number, and XER[OV] is set 
if overflow signaling is enabled. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 

• XER: 

Affected: SO, OV    (if OE=1) 

Note: This instruction is specific to the MPC601. 
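
The corner case for the most negative number can be illustrated with a short Python sketch. It is not taken from the manual: the function name abs_601 and the returned ov_signaled flag are hypothetical, and the flag only models whether overflow would be reported when OE=1.

```python
def abs_601(ra, oe=False):
    """Return (rD, ov_signaled) for abs/abso applied to a 32-bit register value."""
    ra &= 0xFFFFFFFF
    if ra == 0x80000000:
        # Most negative number: the result is unchanged; overflow is
        # signaled only when overflow signaling is enabled (OE=1).
        return ra, bool(oe)
    if ra & 0x80000000:
        # Negative input: two's-complement negate within 32 bits.
        return (-ra) & 0xFFFFFFFF, False
    return ra, False
```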
addx 
Add 
Integer Unit 

add     rD,rA,rB    (OE=0 Rc=0) 
add.    rD,rA,rB    (OE=0 Rc=1) 
addo    rD,rA,rB    (OE=1 Rc=0) 
addo.   rD,rA,rB    (OE=1 Rc=1) 

[POWER mnemonics: cax, cax., caxo, caxo.] 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | B (16-20) | OE (21) | 266 (22-30) | Rc (31) 

rD <- (rA) + (rB) 

The sum (rA) + (rB) is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 
Affected: LT, GT, EQ, SO    (if Rc=1) 

• XER: 
Affected: SO, OV    (if OE=1) 
addcx 
Add Carrying 
Integer Unit 

addc     rD,rA,rB    (OE=0 Rc=0) 
addc.    rD,rA,rB    (OE=0 Rc=1) 
addco    rD,rA,rB    (OE=1 Rc=0) 
addco.   rD,rA,rB    (OE=1 Rc=1) 

[POWER mnemonics: a, a., ao, ao.] 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | B (16-20) | OE (21) | 10 (22-30) | Rc (31) 

rD <- (rA) + (rB) 

The sum (rA) + (rB) is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 
Affected: LT, GT, EQ, SO    (if Rc=1) 

• XER: 
Affected: CA 
Affected: SO, OV    (if OE=1) 
addex 
Add Extended 
Integer Unit 

adde     rD,rA,rB    (OE=0 Rc=0) 
adde.    rD,rA,rB    (OE=0 Rc=1) 
addeo    rD,rA,rB    (OE=1 Rc=0) 
addeo.   rD,rA,rB    (OE=1 Rc=1) 

[POWER mnemonics: ae, ae., aeo, aeo.] 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | B (16-20) | OE (21) | 138 (22-30) | Rc (31) 

rD <- (rA) + (rB) + XER[CA] 

The sum (rA) + (rB) + XER[CA] is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV    (if OE=1) 
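
A common use of the carrying and extended forms is multiword addition: addc adds the low-order words and records the carry in XER[CA], and adde adds the high-order words plus that carry. The Python sketch below models this under the assumption that CA receives the carry out of bit 0 (the most significant bit); the helper names are illustrative only.

```python
M32 = 0xFFFFFFFF


def addc(a, b):
    """Model addc: return (rD, CA) for 32-bit operands."""
    total = (a & M32) + (b & M32)
    return total & M32, total >> 32


def adde(a, b, ca):
    """Model adde: return (rD, CA), adding the incoming carry bit."""
    total = (a & M32) + (b & M32) + ca
    return total & M32, total >> 32


def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values held as (high word, low word) pairs."""
    lo, ca = addc(a_lo, b_lo)      # low words first; carry goes to CA
    hi, _ = adde(a_hi, b_hi, ca)   # high words consume CA
    return hi, lo
```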
addi 
Add Immediate 
Integer Unit 

addi    rD,rA,SIMM 

[POWER mnemonic: cal] 

Instruction encoding (field and bit positions): 
    14 (0-5) | D (6-10) | A (11-15) | SIMM (16-31) 

if rA=0 then rD <- EXTS(SIMM) 
else rD <- (rA) + EXTS(SIMM) 

The sum (rA|0) + SIMM is placed into rD. 

Other registers altered: 
• None 

Simplified mnemonics: 

subi rD,rA,value    equivalent to    addi rD,rA,-value 
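
The (rA|0) convention means that an rA field of 0 selects the constant 0, not the contents of GPR 0. A minimal Python sketch of the addi pseudocode follows; the register file is modeled as a plain list, and the function names are assumptions of this example.

```python
def exts16(simm):
    """Sign-extend the 16-bit SIMM field."""
    simm &= 0xFFFF
    return simm - 0x10000 if simm & 0x8000 else simm


def addi(gpr, rd, ra, simm):
    """Model addi rD,rA,SIMM on a list of 32 register values."""
    base = 0 if ra == 0 else gpr[ra]   # (rA|0): field value 0 means literal 0
    gpr[rd] = (base + exts16(simm)) & 0xFFFFFFFF
```

With rA=0 the instruction simply loads the sign-extended immediate, even if GPR 0 holds a nonzero value.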
addic 
Add Immediate Carrying 
Integer Unit 

addic    rD,rA,SIMM 

[POWER mnemonic: ai] 

Instruction encoding (field and bit positions): 
    12 (0-5) | D (6-10) | A (11-15) | SIMM (16-31) 

rD <- (rA) + EXTS(SIMM) 

The sum (rA) + SIMM is placed into rD. 

Other registers altered: 
• XER: 

Affected: CA 
addic. 
Add Immediate Carrying and Record 
Integer Unit 

addic.    rD,rA,SIMM 

[POWER mnemonic: ai.] 

Instruction encoding (field and bit positions): 
    13 (0-5) | D (6-10) | A (11-15) | SIMM (16-31) 

rD <- (rA) + EXTS(SIMM) 

The sum (rA) + SIMM is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 
Affected: LT, GT, EQ, SO 

• XER: 
Affected: CA 
addis 
Add Immediate Shifted 
Integer Unit 

addis    rD,rA,SIMM 

[POWER mnemonic: cau] 

Instruction encoding (field and bit positions): 
    15 (0-5) | D (6-10) | A (11-15) | SIMM (16-31) 

if rA=0 then rD <- (SIMM || (16)0) 
else rD <- (rA) + (SIMM || (16)0) 

The sum (rA|0) + (SIMM || x'0000') is placed into rD. 

Other registers altered: 
• None 
addmex 
Add to Minus One Extended 
Integer Unit 

addme     rD,rA    (OE=0 Rc=0) 
addme.    rD,rA    (OE=0 Rc=1) 
addmeo    rD,rA    (OE=1 Rc=0) 
addmeo.   rD,rA    (OE=1 Rc=1) 

[POWER mnemonics: ame, ame., ameo, ameo.] 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | 00000 (16-20, reserved) | OE (21) | 234 (22-30) | Rc (31) 

rD <- (rA) + XER[CA] - 1 

The sum (rA) + XER[CA] + x'FFFFFFFF' is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV    (if OE=1) 
addzex 
Add to Zero Extended 
Integer Unit 

addze     rD,rA    (OE=0 Rc=0) 
addze.    rD,rA    (OE=0 Rc=1) 
addzeo    rD,rA    (OE=1 Rc=0) 
addzeo.   rD,rA    (OE=1 Rc=1) 

[POWER mnemonics: aze, aze., azeo, azeo.] 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | 00000 (16-20, reserved) | OE (21) | 202 (22-30) | Rc (31) 

rD <- (rA) + XER[CA] 

The sum (rA) + XER[CA] is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV    (if OE=1) 
andx 
AND 
Integer Unit 

and     rA,rS,rB    (Rc=0) 
and.    rA,rS,rB    (Rc=1) 

Instruction encoding (field and bit positions): 
    31 (0-5) | S (6-10) | A (11-15) | B (16-20) | 28 (21-30) | Rc (31) 

rA <- (rS) & (rB) 

The contents of rS is ANDed with the contents of rB and the result is placed into rA. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 
andcx 
AND with Complement 
Integer Unit 

andc     rA,rS,rB    (Rc=0) 
andc.    rA,rS,rB    (Rc=1) 

Instruction encoding (field and bit positions): 
    31 (0-5) | S (6-10) | A (11-15) | B (16-20) | 60 (21-30) | Rc (31) 

rA <- (rS) & ¬(rB) 

The contents of rS is ANDed with the one's complement of the contents of rB and the result 
is placed into rA. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 
andi. 
AND Immediate 
Integer Unit 

andi.    rA,rS,UIMM 

[POWER mnemonic: andil.] 

Instruction encoding (field and bit positions): 
    28 (0-5) | S (6-10) | A (11-15) | UIMM (16-31) 

rA <- (rS) & (x'0000' || UIMM) 

The contents of rS is ANDed with x'0000' || UIMM and the result is placed into rA. 

Other registers altered: 

• Condition Register (CR0 Field): 
Affected: LT, GT, EQ, SO 






andis. 
AND Immediate Shifted 
Integer Unit 

andis.    rA,rS,UIMM 

[POWER mnemonic: andiu.] 

Instruction encoding (field and bit positions): 
    29 (0-5) | S (6-10) | A (11-15) | UIMM (16-31) 

rA <- (rS) & (UIMM || x'0000') 

The contents of rS is ANDed with UIMM || x'0000' and the result is placed into rA. 

Other registers altered: 

• Condition Register (CR0 Field): 
Affected: LT, GT, EQ, SO 
bx 
Branch 
Branch Processing Unit 

b      target_addr    (AA=0 LK=0) 
ba     target_addr    (AA=1 LK=0) 
bl     target_addr    (AA=0 LK=1) 
bla    target_addr    (AA=1 LK=1) 

Instruction encoding (field and bit positions): 
    18 (0-5) | LI (6-29) | AA (30) | LK (31) 

if AA, then NIA <- EXTS(LI || b'00') 
else NIA <- CIA + EXTS(LI || b'00') 
if LK, then 
    LR <- CIA + 4 

target_addr specifies the branch target address. 

If AA=0, then the branch target address is the sum of LI || b'00' sign-extended and the 
address of this instruction. 

If AA=1, then the branch target address is the value LI || b'00' sign-extended. 

If LK=1, then the effective address of the instruction following the branch instruction is 
placed into the link register. 

Other registers altered: 

Affected: Link Register (LR)    (if LK=1) 
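
The target address computation can be sketched in Python as follows. This is an illustrative model only; the function name branch and its argument order are assumptions, and li is the raw 24-bit LI field from the instruction word.

```python
M32 = 0xFFFFFFFF


def exts(value, bits):
    """Sign-extend a field that is `bits` wide."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch(cia, li, aa, lk, lr):
    """Return (NIA, LR) after executing b/ba/bl/bla at address cia."""
    disp = exts((li & 0xFFFFFF) << 2, 26)    # LI || b'00', sign-extended
    nia = disp if aa else cia + disp         # absolute or CIA-relative
    if lk:
        lr = cia + 4                         # address of the following instruction
    return nia & M32, lr & M32
```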
bcx 
Branch Conditional 
Branch Processing Unit 

bc      BO,BI,target_addr    (AA=0 LK=0) 
bca     BO,BI,target_addr    (AA=1 LK=0) 
bcl     BO,BI,target_addr    (AA=0 LK=1) 
bcla    BO,BI,target_addr    (AA=1 LK=1) 

Instruction encoding (field and bit positions): 
    16 (0-5) | BO (6-10) | BI (11-15) | BD (16-29) | AA (30) | LK (31) 

if ¬BO[2], then CTR <- CTR - 1 
ctr_ok <- BO[2] | ((CTR ≠ 0) ⊕ BO[3]) 
cond_ok <- BO[0] | (CR[BI] ≡ BO[1]) 
if ctr_ok & cond_ok, then 
    if AA, then NIA <- EXTS(BD || b'00') 
    else NIA <- CIA + EXTS(BD || b'00') 
if LK, then 
    LR <- CIA + 4 

The BI field specifies the bit in the Condition Register (CR) to be used as the condition of 
the branch. The BO field is used as described above. 

target_addr specifies the branch target address. 

If AA=0, the branch target address is the sum of BD || b'00' sign-extended and the address 
of this instruction. 

If AA=1, the branch target address is the value BD || b'00' sign-extended. 

If LK=1, the effective address of the instruction following the branch instruction is placed 
into the link register. 

Other registers altered: 

Affected: Count Register (CTR)    (if BO[2]=0) 
Affected: Link Register (LR)      (if LK=1) 

Simplified mnemonics: 

blt target         equivalent to    bc 12,0,target 
bne cr2,target     equivalent to    bc 4,10,target 
bdnz target        equivalent to    bc 16,0,target 
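
The BO and BI tests in the pseudocode can be checked against the simplified mnemonics with a small Python model. It is a sketch under stated assumptions: bo is the 5-bit BO field as an integer, bi is the CR bit number, and bit 0 is the most significant bit of each field; the function name bc_taken is hypothetical.

```python
def bc_taken(bo, bi, cr, ctr):
    """Return (taken, new_ctr) for a conditional branch with fields BO and BI."""

    def bo_bit(i):
        return (bo >> (4 - i)) & 1           # BO[0] is the leftmost of 5 bits

    cr_bit = (cr >> (31 - bi)) & 1           # CR[BI], with CR bit 0 as the MSB
    if not bo_bit(2):
        ctr = (ctr - 1) & 0xFFFFFFFF         # decrement CTR unless BO[2]=1
    ctr_ok = bo_bit(2) or ((ctr != 0) != bool(bo_bit(3)))
    cond_ok = bo_bit(0) or (cr_bit == bo_bit(1))
    return bool(ctr_ok and cond_ok), ctr
```

For example, bdnz (BO=16) ignores the condition register and branches while the decremented CTR is nonzero, whereas blt (BO=12, BI=0) leaves CTR alone and tests the LT bit of CR0.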
bcctrx 
Branch Conditional to Count Register 
Branch Processing Unit 

bcctr     BO,BI    (LK=0) 
bcctrl    BO,BI    (LK=1) 

[POWER mnemonics: bcc, bccl] 

Instruction encoding (field and bit positions): 
    19 (0-5) | BO (6-10) | BI (11-15) | 00000 (16-20, reserved) | 528 (21-30) | LK (31) 

cond_ok <- BO[0] | (CR[BI] ≡ BO[1]) 
if cond_ok then 
    NIA <- CTR || b'00' 
if LK then 
    LR <- CIA + 4 

The BI field specifies the bit in the condition register to be used as the condition of the 
branch. The BO field is used as described above, and the branch target address is 
CTR[0-29] || b'00'. 

If LK=1, the effective address of the instruction following the branch instruction is placed 
into the link register. 

If the "decrement and test CTR" option is specified (BO[2]=0), the instruction form is 
invalid. 

In the case of BO[2]=0 on the MPC601, the decremented count register is tested for zero 
and branches based on this test, but instruction fetching is directed to the address specified 
by the non-decremented version of the count register. The use of this invalid form of the 
bcctrx instruction is not recommended. This description is provided for informational 
purposes only. 

Other registers altered: 

Affected: Link Register (LR) (if LK=1) 

Simplified mnemonics: 
bltctr         equivalent to    bcctr 12,0 

bnectr cr2     equivalent to    bcctr 4,10 
bclrx 
Branch Conditional to Link Register 
Branch Processing Unit 

bclr     BO,BI    (LK=0) 
bclrl    BO,BI    (LK=1) 

[POWER mnemonics: bcr, bcrl] 

Instruction encoding (field and bit positions): 
    19 (0-5) | BO (6-10) | BI (11-15) | 00000 (16-20, reserved) | 16 (21-30) | LK (31) 

if ¬BO[2] then CTR <- CTR - 1 
ctr_ok <- BO[2] | ((CTR ≠ 0) ⊕ BO[3]) 
cond_ok <- BO[0] | (CR[BI] ≡ BO[1]) 
if ctr_ok & cond_ok then 
    NIA <- LR || b'00' 
if LK then 
    LR <- CIA + 4 

The BI field specifies the bit in the condition register to be used as the condition of the 
branch. The BO field is used as described above, and the branch target address is 
LR[0-29] || b'00'. 

If LK=1 then the effective address of the instruction following the branch instruction is 
placed into the link register. 

Other registers altered: 

Affected: Count Register (CTR)    (if BO[2]=0) 
Affected: Link Register (LR)      (if LK=1) 

Simplified mnemonics: 

bltlr         equivalent to    bclr 12,0 
bnelr cr2     equivalent to    bclr 4,10 
bdnzlr        equivalent to    bclr 16,0 
clcs 
Cache Line Compute Size (POWER Architecture Instruction) 
Integer Unit 

clcs    rD,rA 

Instruction encoding (field and bit positions): 
    31 (0-5) | D (6-10) | A (11-15) | 00000 (16-20, reserved) | 531 (21-30) | Rc (31) 

This instruction is not part of the PowerPC architecture. 

This instruction places the cache line size specified by rA into rD, according to the 
following: 

(rA)      Line Size Returned in rD 

00xxx     Undefined 
010xx     Undefined 
01100     Instruction Cache Line Size (64) 
01101     Data Cache Line Size (64) 
01110     Minimum Line Size (64) 
01111     Maximum Line Size (64) 
1xxxx     Undefined 

The value placed in rD shall be 64 for valid values of rA. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: Undefined    (if Rc=1) 
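
The selector table can be expressed as a small lookup. The sketch below is illustrative only; it takes the 5-bit selector code from the table as an integer and returns None for the codes the table lists as undefined, which is a modeling choice of this example and not a statement about what the hardware returns.

```python
VALID_SELECTORS = {
    0b01100: "instruction cache line size",
    0b01101: "data cache line size",
    0b01110: "minimum line size",
    0b01111: "maximum line size",
}


def clcs(selector):
    """Return the line size for a 5-bit selector, or None where the table says Undefined."""
    if (selector & 0b11111) in VALID_SELECTORS:
        return 64          # every valid selector yields 64 on the MPC601
    return None
```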
cmp 
Compare 
Integer Unit 

cmp    crfD,L,rA,rB 

Instruction encoding (field and bit positions): 
    31 (0-5) | crfD (6-8) | 0 (9, reserved) | L (10) | A (11-15) | B (16-20) | 0000000000 (21-30) | 0 (31, reserved) 

a <- (rA) 
b <- (rB) 
if a < b then c <- b'100' 
else if a > b then c <- b'010' 
else c <- b'001' 
CR[4*crfD:4*crfD+3] <- c || XER[SO] 

The contents of rA is compared with the contents of rB, treating the operands as signed 
integers. The result of the comparison is placed into CR Field crfD. 

The L operand controls whether the instruction operands are treated as 64- or 32-bit 
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The 
state of the L operand does not affect the operation of the MPC601. 

Other registers altered: 

• Condition Register (CR Field specified by operand crfD): 
Affected: LT, GT, EQ, SO 
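
The difference between signed and unsigned (logical) comparison of the same bit patterns can be seen in a short Python model of the pseudocode. The function name compare and its signed flag are assumptions made for this sketch; the return value is the 4-bit CR field c || XER[SO].

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def compare(a, b, so, signed=True):
    """Return the 4-bit CR field (LT, GT, EQ, SO) for a 32-bit comparison."""
    if signed:
        a, b = to_signed32(a), to_signed32(b)
    else:
        a, b = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    if a < b:
        c = 0b100
    elif a > b:
        c = 0b010
    else:
        c = 0b001
    return (c << 1) | (so & 1)     # c || XER[SO]
```

The pattern x'FFFFFFFF' compares as less than 1 when treated as signed (it is -1) and as greater than 1 when treated as unsigned.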
cmpi 
Compare Immediate 
Integer Unit 

cmpi    crfD,L,rA,SIMM 

Instruction encoding (field and bit positions): 
    11 (0-5) | crfD (6-8) | 0 (9, reserved) | L (10) | A (11-15) | SIMM (16-31) 

a <- (rA) 
if a < EXTS(SIMM) then c <- b'100' 
else if a > EXTS(SIMM) then c <- b'010' 
else c <- b'001' 
CR[4*crfD:4*crfD+3] <- c || XER[SO] 

The contents of rA is compared with the sign-extended value of the SIMM field, treating 
the operands as signed integers. The result of the comparison is placed into CR Field crfD. 

The L operand controls whether the instruction operands are treated as 64- or 32-bit 
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The 
state of the L operand does not affect the operation of the MPC601. 

Other registers altered: 

• Condition Register (CR Field specified by operand crfD): 
Affected: LT, GT, EQ, SO 
cmpl 
Compare Logical 
Integer Unit 

cmpl    crfD,L,rA,rB 

Instruction encoding (field and bit positions): 
    31 (0-5) | crfD (6-8) | 0 (9, reserved) | L (10) | A (11-15) | B (16-20) | 32 (21-30) | 0 (31, reserved) 

a <- (rA) 
b <- (rB) 
if a <U b then c <- b'100' 
else if a >U b then c <- b'010' 
else c <- b'001' 
CR[4*crfD:4*crfD+3] <- c || XER[SO] 

The contents of rA is compared with the contents of rB, treating the operands as unsigned 
integers. The result of the comparison is placed into CR Field crfD. 

The L operand controls whether the instruction operands are treated as 64- or 32-bit 
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The 
state of the L operand does not affect the operation of the MPC601. 

Other registers altered: 

• Condition Register (CR Field specified by operand crfD): 
Affected: LT, GT, EQ, SO 
cmpli 
Compare Logical Immediate 
Integer Unit 

cmpli    crfD,L,rA,UIMM 

Instruction encoding (field and bit positions): 
    10 (0-5) | crfD (6-8) | 0 (9, reserved) | L (10) | A (11-15) | UIMM (16-31) 

a <- (rA) 
if a <U (x'0000' || UIMM) then c <- b'100' 
else if a >U (x'0000' || UIMM) then c <- b'010' 
else c <- b'001' 
CR[4*crfD:4*crfD+3] <- c || XER[SO] 

The contents of rA is compared with x'0000' || UIMM, treating the operands as unsigned 
integers. The result of the comparison is placed into CR Field crfD. 

The L operand controls whether the instruction operands are treated as 64- or 32-bit 
operands, with L=0 indicating 32-bit operands and L=1 indicating 64-bit operands. The 
state of the L operand does not affect the operation of the MPC601. 

Other registers altered: 

• Condition Register (CR Field specified by operand crfD): 
Affected: LT, GT, EQ, SO 
cntlzwx 
Count Leading Zeros Word 
Integer Unit 

cntlzw     rA,rS    (Rc=0) 
cntlzw.    rA,rS    (Rc=1) 

[POWER mnemonics: cntlz, cntlz.] 

Instruction encoding (field and bit positions): 
    31 (0-5) | S (6-10) | A (11-15) | 00000 (16-20, reserved) | 26 (21-30) | Rc (31) 

n <- 0 
do while n < 32 
    if rS[n]=1 then leave 
    n <- n+1 
rA <- n 

A count of the number of consecutive zero bits starting at bit 0 of rS is placed into rA. This 
number ranges from 0 to 32, inclusive. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO    (if Rc=1) 

For count leading zeros instructions, if Rc=1 then LT is cleared to zero in the CR0 field. 
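
The loop in the pseudocode translates directly into Python. The sketch below assumes, as elsewhere in this chapter, that bit 0 of rS is the most significant bit; the function name is illustrative.

```python
def cntlzw(rs):
    """Count consecutive zero bits starting at bit 0 (the MSB) of a 32-bit value."""
    rs &= 0xFFFFFFFF
    n = 0
    while n < 32:
        if (rs >> (31 - n)) & 1:   # rS[n] = 1: leave the loop
            break
        n += 1
    return n
```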
crand 
Condition Register AND 
Integer Unit 

crand    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 257 (21-30) | 0 (31, reserved) 

CR[crbD] <- CR[crbA] & CR[crbB] 

The bit in the condition register specified by crbA is ANDed with the bit in the condition 
register specified by crbB. The result is placed into the condition register bit specified by 
crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
crandc 
Condition Register AND with Complement 
Integer Unit 

crandc    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 129 (21-30) | 0 (31, reserved) 

CR[crbD] <- CR[crbA] & ¬CR[crbB] 

The bit in the condition register specified by crbA is ANDed with the complement of the 
bit in the condition register specified by crbB and the result is placed into the condition 
register bit specified by crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
creqv 
Condition Register Equivalent 
Integer Unit 

creqv    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 289 (21-30) | 0 (31, reserved) 

CR[crbD] <- CR[crbA] ≡ CR[crbB] 

The bit in the condition register specified by crbA is XORed with the bit in the condition 
register specified by crbB and the complemented result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
crnand 
Condition Register NAND 
Integer Unit 

crnand    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 225 (21-30) | 0 (31, reserved) 

CR[crbD] <- ¬(CR[crbA] & CR[crbB]) 

The bit in the condition register specified by crbA is ANDed with the bit in the condition 
register specified by crbB and the complemented result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
crnor 
Condition Register NOR 
Integer Unit 

crnor    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 33 (21-30) | 0 (31, reserved) 

CR[crbD] <- ¬(CR[crbA] | CR[crbB]) 

The bit in the condition register specified by crbA is ORed with the bit in the condition 
register specified by crbB and the complemented result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
cror 
Condition Register OR 
Integer Unit 

cror    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 449 (21-30) | 0 (31, reserved) 

CR[crbD] <- CR[crbA] | CR[crbB] 

The bit in the condition register specified by crbA is ORed with the bit in the condition 
register specified by crbB. The result is placed into the condition register bit specified by 
crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
crorc 
Condition Register OR with Complement 
Integer Unit 

crorc    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 417 (21-30) | 0 (31, reserved) 

CR[crbD] <- CR[crbA] | ¬CR[crbB] 

The bit in the condition register specified by crbA is ORed with the complement of the 
condition register bit specified by crbB and the result is placed into the condition register 
bit specified by crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
crxor 
Condition Register XOR 
Integer Unit 

crxor    crbD,crbA,crbB 

Instruction encoding (field and bit positions): 
    19 (0-5) | crbD (6-10) | crbA (11-15) | crbB (16-20) | 193 (21-30) | 0 (31, reserved) 

CR[crbD] <- CR[crbA] ⊕ CR[crbB] 

The bit in the condition register specified by crbA is XORed with the bit in the condition 
register specified by crbB and the result is placed into the condition register bit specified by 
crbD. 

Other registers altered: 
• Condition Register: 

Affected: Bit specified by operand crbD 
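
The eight condition register logical instructions differ only in the Boolean function applied to the two source bits. The following Python sketch models all of them on a 32-bit CR value, with CR bit 0 as the most significant bit; the dispatch table and the function name cr_logical are assumptions of this example.

```python
CR_OPS = {
    "crand":  lambda a, b: a & b,
    "crandc": lambda a, b: a & (b ^ 1),
    "creqv":  lambda a, b: (a ^ b) ^ 1,
    "crnand": lambda a, b: (a & b) ^ 1,
    "crnor":  lambda a, b: (a | b) ^ 1,
    "cror":   lambda a, b: a | b,
    "crorc":  lambda a, b: a | (b ^ 1),
    "crxor":  lambda a, b: a ^ b,
}


def cr_logical(op, cr, crbd, crba, crbb):
    """Apply a CR logical instruction and return the updated 32-bit CR value."""
    a = (cr >> (31 - crba)) & 1
    b = (cr >> (31 - crbb)) & 1
    result = CR_OPS[op](a, b) & 1
    bit = 1 << (31 - crbd)
    return ((cr & ~bit) | (result << (31 - crbd))) & 0xFFFFFFFF
```

Using the same bit for all three operands gives two handy idioms: crxor clears the bit and creqv sets it, regardless of its previous value.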
dcbf 
Data Cache Block Flush 
Integer Unit 

dcbf    rA,rB 

Instruction encoding (field and bit positions): 
    31 (0-5) | 00000 (6-10, reserved) | A (11-15) | B (16-20) | 86 (21-30) | 0 (31, reserved) 

EA is the sum (rA|0)+(rB). 



The action taken depends on the memory mode associated with the target address, and on 
the state of the block. The list below describes the action taken for the various cases. The 
actions described will be executed regardless of whether the page or block containing the 
addressed byte is designated as write-through or if it is in caching-inhibited or 
caching-allowed mode. 

• Coherency Required (WIM = xx1) 

— Unmodified Block — Invalidates copies of the block in the caches of all 
processors. 

— Modified Block — Copies the block to memory. Invalidates copies of the block in 
the caches of all processors. 

— Absent Block — If modified copies of the block are in the caches of other 
processors, causes them to be copied to memory and invalidated. If unmodified 
copies are in the caches of other processors, causes those copies to be 
invalidated. 

• Coherency Not Required (WIM = xx0) 

— Unmodified Block — Invalidates the block in the processor's cache. 

— Modified Block — Copies the block to memory. Invalidates the block in the 
processor's cache. 

— Absent Block — Does nothing. 
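
The cases above can be summarized as a decision function. The Python sketch below is only a restatement of the list for quick reference; the function name, the state strings, and the action strings are all hypothetical labels and do not correspond to any hardware interface.

```python
def dcbf_actions(coherency_required, block_state):
    """Return the list of actions dcbf takes for a block in the given state."""
    if block_state not in ("unmodified", "modified", "absent"):
        raise ValueError("unknown block state: " + block_state)
    if coherency_required:                       # WIM = xx1
        if block_state == "unmodified":
            return ["invalidate copies in all caches"]
        if block_state == "modified":
            return ["copy block to memory", "invalidate copies in all caches"]
        # Absent locally: other processors may still hold copies.
        return ["other caches copy modified copies to memory",
                "invalidate copies in other caches"]
    # WIM = xx0: only the local cache is involved.
    if block_state == "unmodified":
        return ["invalidate block in local cache"]
    if block_state == "modified":
        return ["copy block to memory", "invalidate block in local cache"]
    return []                                    # absent: does nothing
```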

This instruction operates as a load from the addressed byte with respect to address 
translation and protection. 

If EA specifies a memory address for which SR[T]=1, the instruction is treated as a no-op. 

Other registers altered: 

• None 
dcbi 
Data Cache Block Invalidate 
Integer Unit 

dcbi    rA,rB 

Instruction encoding (field and bit positions): 
    31 (0-5) | 00000 (6-10, reserved) | A (11-15) | B (16-20) | 470 (21-30) | 0 (31, reserved) 

EA is the sum (rA|0)+(rB). 



The action taken is dependent on the memory mode associated with the target, and the state 
of the block. The list below describes the action to take if the block containing the byte 
addressed by EA is or is not in the cache. The actions described must be executed regardless 
of whether the page containing the addressed byte is in caching-inhibited or caching- 
allowed mode. This is a supervisor-level instruction. 

• Coherency Required (WIM = xx1) 

— Unmodified Block — Invalidates copies of the block in the caches of all 
processors. 

— Modified Block — Invalidates copies of the block in the caches of all processors. 
(Discards the modified contents.) 

— Absent Block — If copies are in the caches of any other processor, causes the 
copies to be invalidated. (Discards any modified contents.) 

• Coherency Not Required (WIM = xx0) 

— Unmodified Block — Invalidates the block in the local cache. 

— Modified Block — Invalidates the block in the local cache. (Discards the modified 
contents.) 

— Absent Block — No action is taken. 

This instruction operates as a store to the addressed byte with respect to address translation 
and protection. The reference and change bits are modified appropriately. If EA specifies a 
memory address for which SR[T]=1, the instruction is treated as a no-op. 

Other registers altered: 

• None 
dcbst 
Data Cache Block Store 
Integer Unit 

dcbst    rA,rB 

Instruction encoding (field and bit positions): 
    31 (0-5) | 00000 (6-10, reserved) | A (11-15) | B (16-20) | 54 (21-30) | 0 (31, reserved) 

EA is the sum (rA|0)+(rB). 



If the block containing the byte addressed by EA is in coherency required mode, and a 
block containing the byte addressed by EA is in the data cache of any processor and has 
been modified, the writing of it to main memory is initiated. 

If the block containing the byte addressed by EA is in coherency not required mode, and a 
block containing the byte addressed by EA is in the data cache of this processor and has 
been modified, the writing of it to main memory is initiated. 

The function of this instruction is independent of the write-through and caching 
inhibited/allowed modes of the page or block containing the byte addressed by EA. 

This instruction operates as a load from the addressed byte with respect to address 
translation and protection. 

If the EA specifies a memory address for an I/O controller interface segment (segment 
register T-bit=l), the dcbst instruction operates as a no-op. 

Other registers altered: 
• None 
dcbt 
Data Cache Block Touch 
Integer Unit 

dcbt    rA,rB 

Instruction encoding (field and bit positions): 
    31 (0-5) | 00000 (6-10, reserved) | A (11-15) | B (16-20) | 278 (21-30) | 0 (31, reserved) 

EA is the sum (rA|0)+(rB). 



This instruction is a hint that performance will probably be improved if the block 
containing the byte addressed by EA is fetched into the data cache, because the program 
will probably soon load from the addressed byte. Executing dcbt does not cause any 
exceptions to be invoked. 

This instruction operates as a load from the addressed byte with respect to address 
translation and protection except that no exception occurs in the case of a translation fault 
or protection violation. 

If the EA specifies a memory address for which SR[T]=1, the instruction is treated as a no-op. 

The purpose of this instruction is to allow the program to request a cache block fetch before 
it is actually needed by the program. The program can later perform loads to put data into 
registers. However, the processor is not obliged to load the addressed block into the data 
cache. If the sector is loaded, it will be either in shared state or exclusive unmodified state. 

Other registers altered: 
• None 
dcbtst 

Data Cache Block Touch for Store 
Integer Unit 

dcbtst rA,rB 

Field:  31    reserved (00000)   A       B       246     reserved 
Bits:   0-5   6-10               11-15   16-20   21-30   31 

EA is the sum (rA|0)+(rB). 



This instruction is a hint that performance will probably be improved if the block 
containing the byte addressed by EA is fetched into the data cache, because the program 
will probably soon store into the addressed byte. Executing dcbtst does not cause any 
exceptions to be invoked. 

This instruction operates as a load from the addressed byte with respect to address 
translation and protection, except that no exception occurs in the case of a translation fault 
or protection violation. Since dcbtst does not modify memory, it is not recorded as a store 
(the change (C) bit is not modified in the page tables). 

If the EA specifies a memory address for which SR[T]=1, the instruction is treated as a no- 
op. 

The dcbtst instruction behaves exactly like the dcbt instruction as implemented on the 
MPC601. 

The purpose of this instruction is to allow the program to schedule a cache block fetch 
before it is actually needed by the program. The program can later perform stores to put 
data into memory. However, the processor is not obliged to load the addressed block into 
the data cache. 

Other registers altered: 
• None 
dcbz 

Data Cache Block Set to Zero 
Integer Unit 

dcbz rA,rB 

[POWER mnemonic: dclz] 

Field:  31    reserved (00000)   A       B       1014    reserved 
Bits:   0-5   6-10               11-15   16-20   21-30   31 

EA is the sum (rA|0)+(rB). 



If the block containing the byte addressed by EA is in the data cache, all bytes of the block 
are cleared to zero. 

If the block containing the byte addressed by EA is not in the data cache and the 
corresponding page is caching allowed, the block is allocated in the data cache without 
fetching the block from main memory, and all bytes of the block are set to zero. 

If the page containing the byte addressed by EA is caching inhibited or write-through, then 
the alignment exception handler is invoked and the handler should clear to zero all bytes of 
the area of memory that corresponds to the addressed block. If the block containing the byte 
addressed by EA is in coherency required mode, and the block exists in the data cache(s) 
of any other processor(s), it is kept coherent in those caches. 

This instruction is treated as a store to the addressed byte with respect to address translation 
and protection. 

If the EA specifies a memory address for an I/O controller interface segment (segment 
register T-bit=1), the dcbz instruction is treated as a no-op. 

See Chapter 5, "Exceptions" for a discussion about a possible delayed machine check 
exception that can occur by use of dcbz if the operating system has set up an incorrect 
memory mapping. 

Other registers altered: 

• None 
divx 

Divide 
POWER Architecture Instruction 
Integer Unit 

div    rD,rA,rB   (OE=0 Rc=0) 
div.   rD,rA,rB   (OE=0 Rc=1) 
divo   rD,rA,rB   (OE=1 Rc=0) 
divo.  rD,rA,rB   (OE=1 Rc=1) 

Field:  31    D      A       B       OE   331     Rc 
Bits:   0-5   6-10   11-15   16-20   21   22-30   31 

This instruction is not part of the PowerPC architecture. 



The quotient [(rA)||(MQ)]÷(rB) is placed into rD. The remainder is placed in the MQ 
register. The remainder has the same sign as the dividend, except that a zero quotient or a 
zero remainder is always positive. The results obey the equation: 

dividend=(divisor × quotient)+remainder 

where dividend is the original (rA)||(MQ), divisor is the original (rB), quotient is the final 
(rD), and remainder is the final (MQ). 

If Rc=1, then CR bits LT, GT, and EQ reflect the remainder. If OE=1, then SO and OV are 
set to one if the quotient cannot be represented in 32 bits. For the case of -2^31 ÷ -1, the MQ 
register is cleared to zero and -2^31 is placed in rD. For all other overflows, MQ, rD, and the 
CR0 field (if Rc=1) are undefined. 
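
The quotient and remainder rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not a description of the hardware: the name div_model and the use of None for results the text calls undefined are assumptions, and treating a zero divisor as one of the "other overflows" is an interpretation of the paragraph above.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def div_model(ra, mq, rb):
    """Hypothetical model of div: returns (rD, MQ, overflow).

    rD and MQ are None where the text calls the results undefined.
    """
    dividend = to_signed(((ra & MASK32) << 32) | (mq & MASK32), 64)
    divisor = to_signed(rb, 32)
    if divisor == 0:
        return None, None, True          # assumption: handled as an overflow
    # Truncate toward zero so the remainder takes the sign of the dividend.
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - divisor * quotient
    if -(1 << 31) <= quotient < (1 << 31):
        return quotient & MASK32, remainder & MASK32, False
    if dividend == -(1 << 31) and divisor == -1:
        return 0x80000000, 0, True       # the one overflow with defined results
    return None, None, True
```

For example, a dividend of -7 (rA=x'FFFF_FFFF', MQ=x'FFFF_FFF9') divided by 2 gives a quotient of -3 and a remainder of -1, which satisfies dividend=(divisor × quotient)+remainder.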

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 

Affected: SO, OV (if OE=1) 

Note: This instruction is specific to the MPC601. 
divsx 

Divide Short 
POWER Architecture Instruction 
Integer Unit 

divs    rD,rA,rB   (OE=0 Rc=0) 
divs.   rD,rA,rB   (OE=0 Rc=1) 
divso   rD,rA,rB   (OE=1 Rc=0) 
divso.  rD,rA,rB   (OE=1 Rc=1) 

Field:  31    D      A       B       OE   363     Rc 
Bits:   0-5   6-10   11-15   16-20   21   22-30   31 

This instruction is not part of the PowerPC architecture. 

The quotient (rA)÷(rB) is placed into rD. The remainder is placed in the MQ register. The 
remainder has the same sign as the dividend, except that a zero quotient or a zero remainder 
is always positive. The results obey the equation: 

dividend=(divisor × quotient)+remainder 

where dividend is the original rA, divisor is the original rB, quotient is the final rD, and 
remainder is the final MQ. 

If Rc=1, then the CR bits LT, GT, and EQ reflect the remainder. If OE=1, then SO and OV 
are set to one if the quotient cannot be represented in 32 bits (e.g., as is the case when the 
divisor is zero, or the dividend is -2^31 and the divisor is -1). For the case of -2^31 ÷ -1, the 
MQ register is cleared to zero and -2^31 is placed in rD. For all other overflows, MQ, rD, and 
the CR0 field (if Rc=1) are undefined. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 

Affected: SO, OV (if OE=1) 

Note: This instruction is specific to the MPC601. 
divwx 

Divide Word 
Integer Unit 

divw    rD,rA,rB   (OE=0 Rc=0) 
divw.   rD,rA,rB   (OE=0 Rc=1) 
divwo   rD,rA,rB   (OE=1 Rc=0) 
divwo.  rD,rA,rB   (OE=1 Rc=1) 

Field:  31    D      A       B       OE   491     Rc 
Bits:   0-5   6-10   11-15   16-20   21   22-30   31 

dividend ← (rA) 
divisor ← (rB) 
rD ← dividend ÷ divisor 

Register rA is the 32-bit dividend. Register rB is the 32-bit divisor. A 32-bit quotient is 
formed and placed into rD. The remainder is not supplied as a result. 

Both operands are interpreted as signed integers. The quotient is the unique signed integer 
that satisfies the following: 

dividend=(quotient times divisor)+r 

where 

0 ≤ r < |divisor| 

if the dividend is non-negative, and 

-|divisor| < r ≤ 0 

if the dividend is negative. 

If an attempt is made to perform any of the divisions 

x'8000_0000' ÷ -1 
<anything> ÷ 0 

then the contents of rD are undefined, as are (if Rc=1) the contents of the LT, GT, and EQ 
bits of the CR0 field. In these cases, if OE=1 then OV is set to 1. 
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Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 

Affected: SO, OV (if OE=1) 

The 32-bit signed remainder of dividing rA by rB can be computed as follows, except in 
the case that rA=-2^31 and rB=-1. 

divw   rD,rA,rB   # rD=quotient 
mullw  rD,rD,rB   # rD=quotient*divisor 
subf   rD,rD,rA   # rD=remainder 
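
The three-instruction remainder sequence above can be checked with a small Python model. This is a sketch under stated assumptions: the helper names (s32, divw, mul_low, subf, signed_remainder) are hypothetical, registers are modeled as unsigned 32-bit integers, and the undefined operand pairs simply raise an error.

```python
MASK32 = 0xFFFFFFFF

def s32(x):
    """Interpret a 32-bit register value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def divw(ra, rb):
    """Signed quotient, truncated toward zero, as a 32-bit value."""
    a, b = s32(ra), s32(rb)
    if b == 0 or (a == -(1 << 31) and b == -1):
        raise ValueError("rD is undefined for these operands")
    q = abs(a) // abs(b)
    return (-q if (a < 0) != (b < 0) else q) & MASK32

def mul_low(ra, rb):
    """Low-order 32 bits of the signed product."""
    return (s32(ra) * s32(rb)) & MASK32

def subf(ra, rb):
    """Subtract from: returns rb - ra, wrapped to 32 bits."""
    return (rb - ra) & MASK32

def signed_remainder(ra, rb):
    rd = divw(ra, rb)        # rD = quotient
    rd = mul_low(rd, rb)     # rD = quotient*divisor
    rd = subf(rd, ra)        # rD = remainder
    return s32(rd)
```

Because the quotient is truncated toward zero, the remainder produced this way has the sign of the dividend, for example -7 and 2 give -1.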
divwux 

Divide Word Unsigned 
Integer Unit 

divwu    rD,rA,rB   (OE=0 Rc=0) 
divwu.   rD,rA,rB   (OE=0 Rc=1) 
divwuo   rD,rA,rB   (OE=1 Rc=0) 
divwuo.  rD,rA,rB   (OE=1 Rc=1) 

Field:  31    D      A       B       OE   459     Rc 
Bits:   0-5   6-10   11-15   16-20   21   22-30   31 

dividend ← (rA) 
divisor ← (rB) 
rD ← dividend ÷ divisor 

The dividend is the contents of rA. The divisor is the contents of rB. A 32-bit quotient is 
formed and placed into rD. The remainder is not supplied as a result. 

Both operands are interpreted as unsigned integers. The quotient is the unique unsigned 
integer that satisfies the following: 

dividend=(quotient * divisor)+r 

where 

0 ≤ r < divisor. 

If an attempt is made to divide by zero, the contents of rD are undefined, as are (if Rc=1) 
the contents of the LT, GT, and EQ bits of the CR0 field. In this case, if OE=1 then OV is 
set to 1. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 

Affected: SO, OV (if OE=1) 

The 32-bit unsigned remainder of dividing rA by rB can be computed as follows, except in 
the case that rB=0. 

divwu  rD,rA,rB   # rD=quotient 
mullw  rD,rD,rB   # rD=quotient*divisor 
subf   rD,rD,rA   # rD=remainder 
dozx 

Difference or Zero 
POWER Architecture Instruction 
Integer Unit 

doz    rD,rA,rB   (OE=0 Rc=0) 
doz.   rD,rA,rB   (OE=0 Rc=1) 
dozo   rD,rA,rB   (OE=1 Rc=0) 
dozo.  rD,rA,rB   (OE=1 Rc=1) 

Field:  31    D      A       B       OE   264     Rc 
Bits:   0-5   6-10   11-15   16-20   21   22-30   31 

This instruction is not part of the PowerPC architecture. 



The sum ¬(rA)+(rB)+1 is placed into rD. If the value in rA is algebraically greater than 
the value in rB, rD is set to zero. If Rc=1, the CR0 field is set to reflect the result placed in 
rD (i.e., if rD is set to zero, EQ is set to 1). If OE=1, OV can only be set on positive 
overflows. 
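
As a rough illustration of the "difference or zero" rule, the following Python sketch computes the complement-and-add sum and applies the signed comparison. The function names are hypothetical, registers are modeled as unsigned 32-bit integers, and the CR0 and XER side effects are not modeled.

```python
MASK32 = 0xFFFFFFFF

def s32(x):
    """Interpret a 32-bit register value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def doz(ra, rb):
    """Return rB - rA (computed as NOT(rA) + rB + 1), or 0 if rA > rB (signed)."""
    if s32(ra) > s32(rb):
        return 0
    return (~(ra & MASK32) + (rb & MASK32) + 1) & MASK32
```

With rA=3 and rB=10 the result is 7. With the operands swapped, rA is the greater value and the result is 0.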

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 

Affected: SO, OV (if OE=1) 

Note: This instruction is specific to the MPC601. 
dozi 

Difference or Zero Immediate 
POWER Architecture Instruction 
Integer Unit 

dozi rD,rA,SIMM 

Field:  9     D      A       SIMM 
Bits:   0-5   6-10   11-15   16-31 

This instruction is not part of the PowerPC architecture. 

The sum ¬(rA)+SIMM+1 is placed into rD. If the value in rA is algebraically greater than 
the value of the SIMM field, rD is set to zero. 

Other registers altered: 
• None 

Note: This instruction is specific to the MPC601. 
eciwx 

External Control Input Word Indexed 
Integer Unit 

eciwx rD,rA,rB 
if rA=0 then b ← 0 
else b ← (rA) 
EA ← b+(rB) 
if EAR[E]=1 then 
    paddr ← address translation of EA 
    send load request for paddr to device identified by EAR[RID] 
    rD ← word from device 
else 
    DSISR[11] ← 1 
    generate data access exception 

EA is the sum (rA|0)+(rB). 
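
The effective address calculation written as (rA|0)+(rB) can be sketched in Python as follows. This is an illustrative model only: the register file is a plain list, the function names are hypothetical, and wrapping the sum to 32 bits is an assumption about how the addition behaves.

```python
MASK32 = 0xFFFFFFFF

def effective_address(gpr, ra, rb):
    """Compute (rA|0)+(rB): register number 0 in the rA field means the value 0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & MASK32

def is_word_aligned(ea):
    """True when the address is a multiple of four bytes."""
    return ea % 4 == 0
```

Note that when the rA field is 0, the contents of r0 are ignored and only (rB) contributes to the address.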

If EAR[E]=1, a load request for the physical address corresponding to EA is sent to the 
device identified by EAR[RID], bypassing the cache. The word returned by the device is 
placed in rD. The EA sent to the device must be word aligned. 

If EAR[E]=0, a data access exception is taken, with bit 11 of DSISR set to 1. 

The eciwx instruction is supported for effective addresses that reference ordinary 
(SR[T]=0) segments, for EAs mapped by the BAT registers, and for EAs generated when 
MSR[DT]=0 (direct translation). The instruction is treated as a no-op for EAs that 
correspond to I/O controller interface (SR[T]=1) segments. 

The access caused by this instruction is treated as a load from the location addressed by EA 
with respect to protection and reference and change recording. 

This instruction is defined as an optional instruction by the PowerPC architecture, and may 
not be available in all PowerPC implementations. 

Other registers altered: 
• None 
ecowx 

External Control Output Word Indexed 
Integer Unit 

ecowx rS,rA,rB 

Field:  31    S      A       B       438     reserved 
Bits:   0-5   6-10   11-15   16-20   21-30   31 

if rA=0 then b ← 0 
else b ← (rA) 
EA ← b+(rB) 
if EAR[E]=1 then 
    paddr ← address translation of EA 
    send store request for paddr to device identified by EAR[RID] 
    send rS to device 
else 
    DSISR[11] ← 1 
    generate data access exception 

EA is the sum (rA|0)+(rB). 

If EAR[E]=1, a store request for the physical address corresponding to EA and the contents 
of rS are sent to the device identified by EAR[RID], bypassing the cache. The EA sent to 
the device must be word aligned. 

If EAR[E]=0, a data access exception is taken, with bit 11 of DSISR set to 1. 

The ecowx instruction is supported for effective addresses that reference ordinary 
(SR[T]=0) segments, for EAs mapped by the BAT registers, and for EAs generated when 
MSR[DT]=0 (direct translation). The instruction is treated as a no-op for EAs that 
correspond to I/O controller interface (SR[T]=1) segments. The access caused by this 
instruction is treated as a store to the location addressed by EA with respect to protection 
and reference and change recording. 

This instruction is defined as an optional instruction by the PowerPC architecture, and may 
not be available in all PowerPC implementations. 

Other registers altered: 
• None 
eieio 

Enforce In-Order Execution of I/O 
Integer Unit 

eieio 
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The eieio instruction provides an ordering function for the effects of load and store 
instructions executed by a given processor. Executing an eieio instruction ensures that all 
memory accesses previously initiated by the given processor are complete with respect to 
main memory before any memory accesses subsequently initiated by the given processor 
access main memory. 

The synchronize (sync) and the enforce in-order execution of I/O (eieio) instructions are 
handled in the same manner internally to the MPC601. These instructions delay execution 
of subsequent instructions until all previous instructions have completed to the point that 
they can no longer cause an exception, all previous memory accesses are performed 
globally, and the sync or eieio operation is broadcast onto the MPC601 bus interface. 

eieio orders loads/stores to caching inhibited memory and stores to write-through required 
memory. 

Other registers altered: 
• None 

The eieio instruction is intended for use only in performing memory-mapped I/O 
operations and to prevent load/store combining operations in main memory. It can be 
thought of as placing a barrier into the stream of memory accesses issued by a processor, 
such that any given memory access appears to be on the same side of the barrier to both the 
processor and the I/O device. 

The eieio instruction may complete before previously initiated memory accesses have been 
performed with respect to other processors and mechanisms. 
eqvx 

Equivalent 
Integer Unit 

eqv   rA,rS,rB   (Rc=0) 
eqv.  rA,rS,rB   (Rc=1) 

rA ← ¬((rS) ⊕ (rB)) 

The contents of rS are XORed with the contents of rB and the complemented result is placed 
into rA. 
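
A minimal Python sketch of this operation, with registers modeled as unsigned 32-bit integers (the function name is chosen for illustration and the CR0 update is not modeled):

```python
MASK32 = 0xFFFFFFFF

def eqv(rs, rb):
    """Complement of the bitwise XOR: a result bit is 1 where the operand bits agree."""
    return ~(rs ^ rb) & MASK32
```

Each result bit is 1 exactly where the corresponding bits of the two operands are equal, so two identical operands produce x'FFFF_FFFF'.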

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 
extsbx 

Extend Sign Byte 

extsb   rA,rS   (Rc=0) 
extsb.  rA,rS   (Rc=1) 

S ← rS[24] 
rA[24-31] ← rS[24-31] 
rA[0-23] ← (24)S 

The contents of rS[24-31] are placed into rA[24-31]. Bit 24 of rS is placed into rA[0-23]. 
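
The sign extension can be sketched in Python as below. The sketch assumes the bit numbering used throughout this manual, in which bit 0 is the most significant bit of the 32-bit register, so rS[24-31] is the low-order byte and bit 24 is that byte's sign bit. The function name is illustrative only.

```python
def extsb(rs):
    """Copy the low-order byte and replicate its sign bit into the upper 24 bits."""
    low_byte = rs & 0xFF
    sign = (low_byte >> 7) & 1
    return (0xFFFFFF00 | low_byte) if sign else low_byte
```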

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 
extshx 

Extend Sign Half Word 
Integer Unit 

extsh   rA,rS   (Rc=0) 
extsh.  rA,rS   (Rc=1) 

[POWER mnemonics: exts, exts.] 

S ← rS[16] 
rA[16-31] ← rS[16-31] 
rA[0-15] ← (16)S 

The contents of rS[16-31] are placed into rA[16-31]. Bit 16 of rS is placed into rA[0-15]. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 
fabsx 

Floating-Point Absolute Value 
Floating-Point Unit 

fabs   frD,frB   (Rc=0) 
fabs.  frD,frB   (Rc=1) 

The contents of frB with bit 0 cleared to zero is placed into frD. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 
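
Clearing the sign bit of a double-precision value can be illustrated in Python by working on the raw 64-bit pattern. This is a sketch, not the processor's data path: it assumes that bit 0 in this manual's numbering is the most significant bit of the 64-bit register (the sign bit), and the function name is hypothetical.

```python
import math
import struct

SIGN_CLEAR_MASK = 0x7FFFFFFFFFFFFFFF   # every bit except the most significant one

def fabs_model(frb):
    """Return frb with its sign bit cleared, leaving all other bits unchanged."""
    bits = struct.unpack(">Q", struct.pack(">d", frb))[0]
    bits &= SIGN_CLEAR_MASK
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

Because only the sign bit changes, -0.0 becomes +0.0, which math.copysign(1.0, fabs_model(-0.0)) confirms by returning 1.0.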
faddx 

Floating-Point Add (Single-Precision) 
Floating-Point Unit 

fadd   frD,frA,frB   (Rc=0) 
fadd.  frD,frA,frB   (Rc=1) 

[POWER mnemonics: fa, fa.] 

Field:  63    D      A       B       reserved (00000)   21      Rc 
Bits:   0-5   6-10   11-15   16-20   21-25              26-30   31 

fadds   frD,frA,frB   (Rc=0) 
fadds.  frD,frA,frB   (Rc=1) 
The floating-point operand in frA is added to the floating-point operand in frB. If the most 
significant bit of the resultant significand is not a one, the result is normalized. The result 
is rounded to the target precision under control of the floating-point rounding control field 
RN of the FPSCR and placed into frD. 

Floating-point addition is based on exponent comparison and addition of the two 
significands. The exponents of the two operands are compared, and the significand 
accompanying the smaller exponent is shifted right, with its exponent increased by one for 
each bit shifted, until the two exponents are equal. The two significands are then added 
algebraically to form an intermediate sum. All 53 bits in the significand as well as all three 
guard bits (G, R, and X) enter into the computation. 

If a carry occurs, the sum's significand is shifted right one bit position and the exponent is 
increased by one. FPSCR[FPRF] is set to the class and sign of the result, except for invalid 
operation exceptions when FPSCR[VE]=1. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI 
fcmpo 

Floating-Point Compare Ordered 
Floating-Point Unit 

fcmpo crfD,frA,frB 
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The floating-point operand in frA is compared to the floating-point operand in frB. The 
result of the compare is placed into CR Field crfD and the FPCC. 

If one of the operands is a NaN, either quiet or signaling, then CR Field crfD and the FPCC 
are set to reflect unordered. If one of the operands is a signaling NaN, then VXSNAN is set, 
and if invalid operation is disabled (VE=0) then VXVC is set. Otherwise, if one of the 
operands is a QNaN then VXVC is set. 

Other registers altered: 

• Condition Register (CR Field specified by operand crfD): 
Affected: FPCC, FX, VXSNAN, VXVC 
fcmpu 

Floating-Point Compare Unordered 
Floating-Point Unit 

fcmpu crfD,frA,frB 
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The floating-point operand in register frA is compared to the floating-point operand in 
register frB. The result of the compare is placed into CR Field crfD and the FPCC. 

If one of the operands is a NaN, either quiet or signaling, then CR Field crfD and the FPCC 
are set to reflect unordered. If one of the operands is a signaling NaN, then VXSNAN is set. 
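
The four possible compare outcomes can be sketched in Python as follows. This is an illustration only: the function name and the string labels are hypothetical stand-ins for the condition bits written to crfD and the FPCC, and the sketch does not distinguish quiet from signaling NaNs, so the VXSNAN side effect is not modeled.

```python
import math

def fcmpu_model(fra, frb):
    """Classify the comparison of fra with frb; any NaN operand gives 'unordered'."""
    if math.isnan(fra) or math.isnan(frb):
        return "unordered"
    if fra < frb:
        return "less than"
    if fra > frb:
        return "greater than"
    return "equal"
```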

Other registers altered: 

• Condition Register (CR Field specified by operand crfD): 
Affected: FPCC, FX, VXSNAN 
fctiwx 

Floating-Point Convert to Integer Word 
Floating-Point Unit 

fctiw   frD,frB   (Rc=0) 
fctiw.  frD,frB   (Rc=1) 
The floating-point operand in register frB is converted to a 32-bit signed integer, using the 
rounding mode specified by FPSCR[RN], and placed in bits 32-63 of frD. Bits 0-31 of frD 
are undefined. 

If the contents of frB is greater than 2^31-1, bits 32-63 of frD are set to x'7FFF_FFFF'. 

If the contents of frB is less than -2^31, bits 32-63 of frD are set to x'8000_0000'. 

The conversion is described fully in Appendix F.2, "Conversion from Floating-Point 
Number to Unsigned Fixed-Point Integer Word." 

Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] is undefined. 
FPSCR[FR] is set if the result is incremented when rounded. FPSCR[FI] is set if the result 
is inexact. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI 
fctiwzx 

Floating-Point Convert to Integer Word with Round toward Zero 
Floating-Point Unit 

fctiwz   frD,frB   (Rc=0) 
fctiwz.  frD,frB   (Rc=1) 
The floating-point operand in register frB is converted to a 32-bit signed integer, using the 
rounding mode round toward zero, and placed in bits 32-63 of frD. Bits 0-31 of frD are 
undefined. 

If the operand in frB is greater than 2^31-1, bits 32-63 of frD are set to x'7FFF_FFFF'. 

If the operand in frB is less than -2^31, bits 32-63 of frD are set to x'8000_0000'. 

The conversion is described fully in Appendix F.2, "Conversion from Floating-Point 
Number to Unsigned Fixed-Point Integer Word." 

Except for trap-enabled invalid operation exceptions, FPSCR[FPRF] is undefined. 
FPSCR[FR] is set if the result is incremented when rounded. FPSCR[FI] is set if the result 
is inexact. 
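
The round-toward-zero conversion and its two saturation limits can be sketched in Python. The following is a simplified model under stated assumptions: the function name is hypothetical, only the low-order word (bits 32-63 of frD) is returned, the FPSCR side effects are not modeled, and a NaN operand is rejected because its handling is not described in this section.

```python
import math

MASK32 = 0xFFFFFFFF
INT32_MAX = (1 << 31) - 1
INT32_MIN = -(1 << 31)

def fctiwz_model(frb):
    """Return the 32-bit pattern produced by truncating frb toward zero."""
    if math.isnan(frb):
        raise ValueError("NaN operands are outside the scope of this sketch")
    if frb > INT32_MAX:
        return 0x7FFFFFFF        # saturate at the largest positive integer
    if frb < INT32_MIN:
        return 0x80000000        # saturate at the most negative integer
    return math.trunc(frb) & MASK32
```

For example, 2.9 converts to 2 and -2.9 converts to -2 (x'FFFF_FFFE'), while any operand above 2^31-1, including +infinity, yields x'7FFF_FFFF'.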

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI 
fdivx 

Floating-Point Divide (Single-Precision) 
Floating-Point Unit 

fdiv   frD,frA,frB   (Rc=0) 
fdiv.  frD,frA,frB   (Rc=1) 

[POWER mnemonics: fd, fd.] 

fdivs   frD,frA,frB   (Rc=0) 
fdivs.  frD,frA,frB   (Rc=1) 
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The floating-point operand in register frA is divided by the floating-point operand in 
register frB. No remainder is preserved. 

If an operand is a denormalized number then it is prenormalized before the operation is 
started. If the most significant bit of the resultant significand is not a one the result is 
normalized. The result is rounded to the target precision under control of the floating-point 
rounding control field RN of the FPSCR and placed into frD. 

Floating-point division is based on exponent subtraction and division of the significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1 and zero divide exceptions when FPSCR[ZE]=1. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, ZX, XX, VXSNAN, VXIDI, VXZDZ 
fmaddx 

Floating-Point Multiply-Add (Single-Precision) 
Floating-Point Unit 

fmadd   frD,frA,frC,frB   (Rc=0) 
fmadd.  frD,frA,frC,frB   (Rc=1) 

[POWER mnemonics: fma, fma.] 

Field:  63    D      A       B       C       29      Rc 
Bits:   0-5   6-10   11-15   16-20   21-25   26-30   31 

fmadds   frD,frA,frC,frB   (Rc=0) 
fmadds.  frD,frA,frC,frB   (Rc=1) 

The following operation is performed: 

frD ← [(frA)*(frC)]+(frB) 

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is added to this intermediate result. 

If an operand is a denormalized number then it is prenormalized before the operation is 
started. If the most significant bit of the resultant significand is not a one the result is 
normalized. The result is rounded to the target precision under control of the floating-point 
rounding control field RN of the FPSCR and placed into frD. 
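
The practical effect of rounding only once, after the addition, can be illustrated in Python with exact rational arithmetic. This is a sketch under assumptions: it takes the description above to mean that the intermediate product is not rounded, it supports only finite operands, it always rounds to nearest (the instruction itself follows FPSCR[RN]), and the function name is hypothetical.

```python
from fractions import Fraction

def fmadd_model(fra, frc, frb):
    """Compute (fra * frc) + frb exactly, then round once to double precision."""
    exact = Fraction(fra) * Fraction(frc) + Fraction(frb)
    return float(exact)

a = 1.0 + 2.0 ** -27
b = -(1.0 + 2.0 ** -26)
single_rounding = fmadd_model(a, a, b)   # keeps the 2**-54 term of the product
double_rounding = a * a + b              # the product is rounded before the add
```

In this example the exact product is 1 + 2^-26 + 2^-54. Rounding the product first discards the 2^-54 term, so the separately rounded multiply and add returns 0, while the single-rounding model returns 2^-54.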

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
fmrx 

Floating-Point Move Register 
Floating-Point Unit 

fmr   frD,frB   (Rc=0) 
fmr.  frD,frB   (Rc=1) 

The contents of register frB is placed into frD. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 
fmsubx 

Floating-Point Multiply-Subtract (Single-Precision) 
Floating-Point Unit 

fmsub   frD,frA,frC,frB   (Rc=0) 
fmsub.  frD,frA,frC,frB   (Rc=1) 

[POWER mnemonics: fms, fms.] 

fmsubs   frD,frA,frC,frB   (Rc=0) 
fmsubs.  frD,frA,frC,frB   (Rc=1) 

The following operation is performed: 

frD ← [(frA)*(frC)] - (frB) 

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is subtracted from this intermediate 
result. 

If an operand is a denormalized number then it is prenormalized before the operation is 
started. If the most significant bit of the resultant significand is not a one the result is 
normalized. The result is rounded to the target precision under control of the floating-point 
rounding control field RN of the FPSCR and placed into frD. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
fmulx 

Floating-Point Multiply (Single-Precision) 
Floating-Point Unit 

fmul   frD,frA,frC   (Rc=0) 
fmul.  frD,frA,frC   (Rc=1) 

[POWER mnemonics: fm, fm.] 

fmuls   frD,frA,frC   (Rc=0) 
fmuls.  frD,frA,frC   (Rc=1) 
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The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. 

If an operand is a denormalized number then it is prenormalized before the operation is 
started. If the most significant bit of the resultant significand is not a one the result is 
normalized. The result is rounded to the target precision under control of the floating-point 
rounding control field RN of the FPSCR and placed into frD. 

Floating-point multiplication is based on exponent addition and multiplication of the 
significands. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXIMZ 
fnabsx 

Floating-Point Negative Absolute Value 
Floating-Point Unit 

fnabs   frD,frB   (Rc=0) 
fnabs.  frD,frB   (Rc=1) 

The contents of register frB with bit 0 set to one is placed into frD. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 
fnegx 

Floating-Point Negate 
Floating-Point Unit 

fneg   frD,frB   (Rc=0) 
fneg.  frD,frB   (Rc=1) 

The contents of register frB with bit 0 inverted is placed into frD. 

Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 
fnmaddx 

Floating-Point Negative Multiply-Add (Single-Precision) 
Floating-Point Unit 

fnmadd   frD,frA,frC,frB   (Rc=0) 
fnmadd.  frD,frA,frC,frB   (Rc=1) 

[POWER mnemonics: fnma, fnma.] 

fnmadds   frD,frA,frC,frB   (Rc=0) 
fnmadds.  frD,frA,frC,frB   (Rc=1) 

The following operation is performed: 

frD ← -([(frA)*(frC)]+(frB)) 

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is added to this intermediate result. 
If an operand is a denormalized number then it is prenormalized before the operation is 
started. If the most significant bit of the resultant significand is not a one the result is 
normalized. The result is rounded to the target precision under control of the floating-point 
rounding control field RN of the FPSCR, then negated and placed into frD. 

This instruction produces the same result as would be obtained by using the floating-point 
multiply-add instruction and then negating the result, with the following exceptions: 

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid operation exception 
have a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled invalid operation 
exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 
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Other registers altered: 

• Condition Register (CR1 Field): 

Affected: FX, FEX, VX, OX (if Rc=1) 

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
fnmsubx 

Floating-Point Negative Multiply-Subtract (Single-Precision) 
Floating-Point Unit 

fnmsub   frD,frA,frC,frB   (Rc=0) 
fnmsub.  frD,frA,frC,frB   (Rc=1) 

[POWER mnemonics: fnms, fnms.] 

fnmsubs   frD,frA,frC,frB   (Rc=0) 
fnmsubs.  frD,frA,frC,frB   (Rc=1) 

The following operation is performed: 

frD ← -([(frA)*(frC)] - (frB)) 

The floating-point operand in register frA is multiplied by the floating-point operand in 
register frC. The floating-point operand in register frB is subtracted from this intermediate 
result. 

If an operand is a denormalized number, it is prenormalized before the operation is started. 
If the most significant bit of the resultant significand is not one, the result is normalized. 
The result is rounded to the target precision under control of the floating-point rounding 
control field RN of the FPSCR, then negated and placed into frD. 

This instruction produces the same result as would be obtained by negating the result of a
floating-point multiply-subtract instruction, with the following exceptions:

• QNaNs propagate with no effect on their sign bit. 

• QNaNs that are generated as the result of a disabled invalid operation exception 
have a sign bit of zero. 

• SNaNs that are converted to QNaNs as the result of a disabled invalid operation 
exception retain the sign bit of the SNaN. 

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 
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Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Floating-point Status and Control Register: 

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI, VXIMZ 
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frspx 

Floating-Point Round to Single-Precision 



frsp frD,frB (Rc=0)
frsp. frD,frB (Rc=1)

Floating-Point Unit
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If it is already in single-precision range, the floating-point operand in register frB is placed 
into frD. Otherwise the floating-point operand in register frB is rounded to single-precision 
using the rounding mode specified by FPSCR[RN] and placed into frD. 

The rounding is described fully in Appendix F.1, "Conversion from Floating-Point Number
to Signed Fixed-Point Integer Word."

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation
exceptions when FPSCR[VE]=1.

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Floating-point Status and Control Register: 
Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN 
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fsubx 

Floating-Point Subtract (Single-Precision) 



fsub frD,frA,frB (Rc=0)

fsub. frD,frA,frB (Rc=1)

[POWER mnemonics: fs, fs.]
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The floating-point operand in register frB is subtracted from the floating-point operand in 
register frA. If the most significant bit of the resultant significand is not a one, the result is
normalized. The result is rounded to the target precision under control of the floating-point
rounding control field RN of the FPSCR and placed into frD.

The execution of the floating-point subtract instruction is identical to that of floating-point
add, except that the contents of frB participate in the operation with the sign bit (bit 0)
inverted.

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Floating-point Status and Control Register:

Affected: FPRF, FR, FI, FX, OX, UX, XX, VXSNAN, VXISI
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icbi 

Instruction Cache Block Invalidate 
EA is the sum (rA|0)+(rB).



In other PowerPC processors, if the block containing the byte addressed by EA is in 
coherency required mode, and a block containing the byte addressed by EA is in the 
instruction cache of any processor, the block is made invalid in all such processors, so that 
subsequent references cause the block to be refetched. 

Also, if the block containing the byte addressed by EA is in coherency not required mode, 
and a block containing the byte addressed by EA is in the instruction cache of this 
processor, the block is made invalid in this processor, so that subsequent references cause 
the block to be fetched from main memory (or perhaps from a data cache). 

The MPC601 treats the icbi instruction as a no-op, even to the extent of not validating the
EA.

Other registers altered: 
• None 
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isync 

Instruction Synchronize 

isync 

[POWER mnemonic: ics] 
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This instruction waits for all previous instructions to complete and then discards any 
prefetched instructions, causing subsequent instructions to be fetched (or refetched) from 
memory and to execute in the context established by the previous instructions. This 
instruction has no effect on other processors or on their caches. 

This instruction is context synchronizing. 

Other registers altered: 
• None 
lbz

Load Byte and Zero

lbz rD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- (24)0 || MEM(EA, 1)

The effective address is the sum (rA|0) + d. The byte in memory addressed by EA is loaded
into rD[24-31]. Bits rD[0-23] are cleared to 0.

Other registers altered: 
• None 
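
The effective-address rule (rA|0) + EXTS(d) and the zero-extension of the loaded byte can be sketched in Python. This is only an illustrative model, not a description of the MPC601 hardware: the names exts16, effective_address, and lbz are hypothetical, the register file is assumed to be a list of 32 integers, and memory is assumed to be a bytes object indexed by address.

```python
def exts16(d):
    """Sign-extend a 16-bit displacement (EXTS in the RTL)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def effective_address(gpr, ra, d):
    """Compute (rA|0) + EXTS(d); operand rA=0 means the value 0, not the contents of r0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + exts16(d)) & 0xFFFFFFFF


def lbz(gpr, mem, rd, ra, d):
    """Load one byte into rD[24-31]; rD[0-23] are cleared to 0."""
    ea = effective_address(gpr, ra, d)
    gpr[rd] = mem[ea] & 0xFF
```

In this sketch a displacement of 0xFFFC acts as -4, and a nonzero value in r0 is ignored when rA=0 is used as the base.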
lbzu

Load Byte and Zero with Update

lbzu rD,d(rA)

Integer Unit

Instruction format: bits 0-5 = 35, bits 6-10 = D, bits 11-15 = A, bits 16-31 = d.

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- (24)0 || MEM(EA, 1)
rA <- EA

EA is the sum (rA|0) + d. The byte in memory addressed by EA is loaded into rD[24-31].
Bits rD[0-23] are cleared to 0.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lbzux

Load Byte and Zero with Update Indexed

lbzux rD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- (24)0 || MEM(EA, 1)
rA <- EA

EA is the sum (rA|0) + (rB). The byte addressed by EA is loaded into rD[24-31]. Bits
rD[0-23] are set to 0.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lbzx

Load Byte and Zero Indexed

lbzx rD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- (24)0 || MEM(EA, 1)

EA is the sum (rA|0) + (rB). The byte in memory addressed by EA is loaded into
rD[24-31].

Bits rD[0-23] are set to 0.

Other registers altered: 
• None 
lfd

Load Floating-Point Double-Precision

lfd frD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
frD <- MEM(EA, 8)

EA is the sum (rA|0) + d.

The double word in memory addressed by EA is placed into frD.

Other registers altered: 
• None 
lfdu

Load Floating-Point Double-Precision with Update

lfdu frD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
frD <- MEM(EA, 8)
rA <- EA

EA is the sum (rA|0) + d.

The double word in memory addressed by EA is placed into frD.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0. The PowerPC architecture
defines load with update instructions with operand rA=0 as an invalid form.

Other registers altered: 

• None 
lfdux

Load Floating-Point Double-Precision with Update Indexed

lfdux frD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
frD <- MEM(EA, 8)
rA <- EA

EA is the sum (rA|0) + (rB).

The double word in memory addressed by EA is placed into frD.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0. The PowerPC architecture
defines load with update instructions with operand rA=0 as an invalid form.

Other registers altered: 

• None 
lfdx

Load Floating-Point Double-Precision Indexed

lfdx frD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
frD <- MEM(EA, 8)

EA is the sum (rA|0) + (rB).

The double word in memory addressed by EA is placed into frD.

Other registers altered: 
• None 
lfs

Load Floating-Point Single-Precision

lfs frD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
frD <- DOUBLE(MEM(EA, 4))

EA is the sum (rA|0) + d.

The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.6.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions") and placed into frD.

Other registers altered: 
• None 
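
The DOUBLE() step of a single-precision load can be illustrated with Python's struct module, which widens an IEEE single to a double without changing its value. This is a sketch under stated assumptions: the names double_from_single and lfs are hypothetical, memory is assumed to be a big-endian bytes object, and the floating-point registers are modeled as a list of Python floats.

```python
import struct


def double_from_single(word):
    """Interpret 4 big-endian bytes as an IEEE single and widen it (DOUBLE in the RTL)."""
    return struct.unpack(">f", word)[0]


def lfs(fpr, mem, frd, ea):
    """Place the widened single-precision operand found at EA into frD."""
    fpr[frd] = double_from_single(mem[ea:ea + 4])
```

Because every single-precision value is exactly representable in double precision, the widening itself never rounds.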
lfsu

Load Floating-Point Single-Precision with Update

lfsu frD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
frD <- DOUBLE(MEM(EA, 4))
rA <- EA

EA is the sum (rA|0) + d.

The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.6.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions") and placed into frD.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0. The PowerPC architecture
defines load with update instructions with operand rA=0 as an invalid form.

Other registers altered: 

• None 
lfsux

Load Floating-Point Single-Precision with Update Indexed

lfsux frD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
frD <- DOUBLE(MEM(EA, 4))
rA <- EA

EA is the sum (rA|0) + (rB).

The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.6.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions") and placed into frD.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0. The PowerPC architecture
defines load with update instructions with operand rA=0 as an invalid form.

Other registers altered: 

• None 
lfsx

Load Floating-Point Single-Precision Indexed

lfsx frD,rA,rB

Integer Unit and

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
frD <- DOUBLE(MEM(EA, 4))

EA is the sum (rA|0) + (rB).

The word in memory addressed by EA is interpreted as a floating-point single-precision
operand. This word is converted to floating-point double-precision (see Section 3.6.9.1,
"Double-Precision Conversion for Floating-Point Load Instructions") and placed into frD.

Other registers altered: 
• None 
lha

Load Half Word Algebraic

lha rD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- EXTS(MEM(EA, 2))

EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0-15] are filled with a copy of bit 0 of the loaded half word.

Other registers altered: 
• None 
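
The algebraic (sign-extending) half-word load can be sketched as below. It is an illustrative model only: exts16 and lha are hypothetical names, registers are assumed to be a list of 32 unsigned 32-bit integers, and memory is assumed to be a big-endian bytes object.

```python
def exts16(v):
    """Sign-extend a 16-bit value (EXTS in the RTL)."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def lha(gpr, mem, rd, ra, d):
    """Load a half word into rD[16-31] and copy its bit 0 into rD[0-15]."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + exts16(d)) & 0xFFFFFFFF
    half = int.from_bytes(mem[ea:ea + 2], "big")
    # Keep the register value as an unsigned 32-bit pattern.
    gpr[rd] = exts16(half) & 0xFFFFFFFF
```

A half word whose most significant bit is set, such as 0xFFFE, therefore yields 0xFFFFFFFE in rD, while 0x7FFF yields 0x00007FFF.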
lhau

Load Half Word Algebraic with Update

lhau rD,d(rA)

Integer Unit

Instruction format: bits 0-5 = 43, bits 6-10 = D, bits 11-15 = A, bits 16-31 = d.

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- EXTS(MEM(EA, 2))
rA <- EA

EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into
rD[16-31].

Bits rD[0-15] are filled with a copy of bit 0 of the loaded half word.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lhaux

Load Half Word Algebraic with Update Indexed

lhaux rD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- EXTS(MEM(EA, 2))
rA <- EA

EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0-15] are filled with a copy of bit 0 of the loaded half word.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lhax

Load Half Word Algebraic Indexed

lhax rD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- EXTS(MEM(EA, 2))

EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0-15] are filled with a copy of bit 0 of the loaded half word.

Other registers altered: 
• None 
lhbrx

Load Half Word Byte-Reverse Indexed

lhbrx rD,rA,rB

Integer Unit

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- (16)0 || MEM(EA+1, 1) || MEM(EA, 1)

EA is the sum (rA|0) + (rB). Bits 0-7 of the half word in memory addressed by EA are
loaded into rD[24-31]. Bits 8-15 of the half word in memory addressed by EA are loaded
into rD[16-23]. Bits rD[0-15] are cleared to 0.

The PowerPC architecture cautions programmers that some implementations of the
architecture may run the lhbrx instructions with greater latency than other types of load
instructions. This is not the case in the MPC601. This instruction operates with the same
latency as other load instructions.

Other registers altered: 
• None 
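
The byte-reversed half-word load follows directly from the concatenation in the RTL: the byte at EA+1 becomes the more significant byte of the result. A minimal sketch, assuming a list-based register file and a bytes object for memory (lhbrx here is a hypothetical Python function, not an API):

```python
def lhbrx(gpr, mem, rd, ra, rb):
    """rD <- (16)0 || MEM(EA+1, 1) || MEM(EA, 1), with EA = (rA|0) + (rB)."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + gpr[rb]) & 0xFFFFFFFF
    # The byte at EA+1 lands in rD[16-23]; the byte at EA lands in rD[24-31].
    gpr[rd] = (mem[ea + 1] << 8) | mem[ea]
```

With the bytes 0x56, 0x78 stored at EA and EA+1, the register receives 0x00007856.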
lhz

Load Half Word and Zero

lhz rD,d(rA)

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- (16)0 || MEM(EA, 2)

EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0-15] are cleared to 0.

Other registers altered: 
• None 
lhzu

Load Half Word and Zero with Update

lhzu rD,d(rA)

Integer Unit

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- (16)0 || MEM(EA, 2)
rA <- EA

EA is the sum (rA|0) + d. The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0-15] are cleared to 0.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lhzux

Load Half Word and Zero with Update Indexed

lhzux rD,rA,rB

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- (16)0 || MEM(EA, 2)
rA <- EA

EA is the sum (rA|0) + (rB). The half word in memory addressed by EA is loaded into
rD[16-31]. Bits rD[0-15] are cleared to 0.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lhzx

Load Half Word and Zero Indexed

lhzx rD,rA,rB

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 279, bit 31 reserved.

if rA=0 then b <- 0
else b <- rA

EA <- b+rB
rD <- (16)0 || MEM(EA, 2)

The effective address is the sum (rA|0) + (rB). The half word in memory addressed by EA
is loaded into rD[16-31]. Bits rD[0-15] are cleared to 0.

Other registers altered: 
• None 
lmw

Load Multiple Word

lmw rD,d(rA)

[POWER mnemonic: lm]

Integer Unit

Instruction format: bits 0-5 = 46, bits 6-10 = D, bits 11-15 = A, bits 16-31 = d.

if rA=0 then b <- 0
else b <- rA
EA <- b+EXTS(d)
r <- rD
do while r <= 31
    GPR(r) <- MEM(EA, 4)
    r <- r+1
    EA <- EA+4

EA is the sum (rA|0) + d.

n=(32-D).

n consecutive words starting at EA are loaded into the 32 bits of GPRs rD through r31. EA
must be a multiple of 4; otherwise, the system alignment exception handler is invoked if
the load crosses a page boundary.

If rA is in the range of registers specified to be loaded, it will be skipped in the load process.
If operand rA=0, the register is not considered as used for addressing, and will be loaded.

Other registers altered: 
• None 
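
The load-multiple loop, including the skipping of rA when it falls in the range of loaded registers, can be sketched as follows. This is a hedged model rather than a statement of hardware behavior: the names exts16 and lmw are hypothetical, and the sketch assumes that EA still advances by 4 past the word that corresponds to the skipped register, which the text above does not state explicitly.

```python
def exts16(d):
    """Sign-extend a 16-bit displacement (EXTS in the RTL)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lmw(gpr, mem, rd, ra, d):
    """Load consecutive words into GPRs rD through r31."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + exts16(d)) & 0xFFFFFFFF
    for r in range(rd, 32):
        # Skip rA when it is used for addressing (operand rA != 0).
        if not (ra != 0 and r == ra):
            gpr[r] = int.from_bytes(mem[ea:ea + 4], "big")
        # Assumption: the memory slot of a skipped register is stepped over.
        ea += 4
```

For example, loading r29 through r31 with rA=r30 leaves r30 holding its base address while r29 and r31 receive the first and third words.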
lscbxx POWER Architecture Instruction

Load String and Compare Byte Indexed

lscbx rD,rA,rB (Rc=0)
lscbx. rD,rA,rB (Rc=1)

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 277, bit 31 = Rc.

This instruction is not part of the PowerPC architecture.

EA is the sum (rA|0) + (rB). XER[25-31] contains the byte count. Register rD is the
starting register.

n=XER[25-31], which is the number of bytes to be loaded. nr=CEIL(n/4), which is the number
of registers to receive data.

Starting with the leftmost byte in rD, consecutive bytes in memory addressed by the EA
are loaded into rD through rD+nr-1, wrapping around back through GPR 0 if required,
until either a byte match is found with XER[16-23] or n bytes have been loaded. If a byte
match is found, that byte is also loaded.

Bytes are always loaded left to right in the register. In the case when a match was found
before n bytes were loaded, the contents of the rightmost byte(s) not loaded of that register
and the contents of all succeeding registers up to and including rD+nr-1 are undefined.
Also, no reference is made to memory after the matched byte is found. In the case when a
match was not found, the contents of the rightmost byte(s) not loaded of rD+nr-1 are
undefined.

When XER[25-31]=0, the content of rD is undefined.

The count of the number of bytes loaded up to and including the matched byte, if a match
was found, is placed in XER[25-31].

If rA or rB is in the range of registers specified to be loaded, that register is skipped in the
load process. If operand rA=0, the register is not considered as used for addressing, and
will be loaded.

Other registers affected:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: XER[25-31]=# of bytes loaded

Note: If Rc=1 and XER[25-31]=0 then the CR0 field is undefined. If Rc=1 and
XER[25-31]≠0 then the CR0 field is set as follows:

LT, GT, EQ, SO = b'00' || match || XER(SO)
lswi

Load String Word Immediate

lswi rD,rA,NB

[POWER mnemonic: lsi]

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = NB, bits 21-30 = 597, bit 31 reserved.

if rA=0 then EA <- 0
else EA <- rA
if NB=0 then n <- 32
else n <- NB
r <- rD - 1
i <- 0
do while n > 0
    if i=0 then
        r <- r+1 (mod 32)
        GPR(r) <- 0
    GPR(r)[i-i+7] <- MEM(EA, 1)
    i <- i+8
    if i=32 then i <- 0
    EA <- EA+1
    n <- n-1

The EA is (rA|0).

Let n=NB if NB≠0, n=32 if NB=0; n is the number of bytes to load. Let nr=CEIL(n/4); nr
is the number of registers to be loaded with data.

n consecutive bytes starting at the EA are loaded into GPRs rD through rD+nr-1. Bytes are
loaded left to right in each register. The sequence of registers wraps around to r0 if required.
If the four bytes of register rD+nr-1 are only partially filled, the unfilled low-order byte(s)
of that register are cleared to 0.

If rA is in the range of registers specified to be loaded, it will be skipped in the load process.
If operand rA=0, the register is not considered as used for addressing, and will be loaded.

Other registers altered: 
• None 
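
The string-load byte packing (left to right within each register, wrapping from r31 to r0, with unfilled low-order bytes cleared) can be sketched in Python. This is a simplified model under stated assumptions: lswi is a hypothetical function name, memory is a bytes object, and the skipping of rA when it lies in the loaded range is deliberately not modeled.

```python
def lswi(gpr, mem, rd, ra, nb):
    """Load n bytes starting at (rA|0) into successive GPRs beginning with rD."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb
    r = (rd - 1) % 32
    i = 0  # bit offset of the next byte within the current register
    while n > 0:
        if i == 0:
            r = (r + 1) % 32  # the register sequence wraps around to r0
            gpr[r] = 0        # unfilled low-order bytes stay cleared
        gpr[r] |= mem[ea] << (24 - i)
        i = (i + 8) % 32
        ea += 1
        n -= 1
```

Loading six bytes starting in r31 fills r31 completely and places the remaining two bytes in the high-order half of r0.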
lswx

Load String Word Indexed

lswx rD,rA,rB

[POWER mnemonic: lsx]

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 533, bit 31 reserved.

if rA=0 then b <- 0
else b <- rA
EA <- b+rB
n <- XER[25-31]
r <- rD - 1
i <- 0
do while n > 0
    if i=0 then
        r <- r+1 (mod 32)
        GPR(r) <- 0
    GPR(r)[i-i+7] <- MEM(EA, 1)
    i <- i+8
    if i=32 then i <- 0
    EA <- EA+1
    n <- n-1

EA is the sum (rA|0)+(rB). Let n=XER[25-31]; n is the number of bytes to load. Let
nr=CEIL(n/4); nr is the number of registers to receive data.

If n>0, n consecutive bytes starting at EA are loaded into GPRs rD through rD+nr-1.

Bytes are loaded left to right in each register. The sequence of registers wraps around
through r0 if required. If the bytes of rD+nr-1 are only partially filled, the unfilled
low-order byte(s) of that register are cleared to 0.

If n=0, the content of rD is undefined.

If rA or rB is in the range of registers specified to be loaded, that register is skipped in the
load process. If operand rA=0, the register is not considered as used for addressing, and
will be loaded.

Other registers altered: 

• None 
lwarx

Load Word and Reserve Indexed

lwarx rD,rA,rB

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 20, bit 31 reserved.

if rA=0 then b <- 0
else b <- rA
EA <- b+rB
RESERVE <- 1
RESERVE_ADDR <- func(EA)
rD <- MEM(EA, 4)

EA is the sum (rA|0) + (rB). The word in memory addressed by EA is loaded into rD.

This instruction creates a reservation for use by a store word conditional instruction. An
address computed from the EA is associated with the reservation, and replaces any address
previously associated with the reservation: the manner in which the address to be associated
with the reservation is computed from the EA is described in Section 3.1.1, "Effective
Address Calculation".

The EA must be a multiple of 4. If it is not, the alignment exception handler will be invoked
if the load crosses a page boundary, or the results will be boundedly undefined.

Other registers altered: 
• None 
lwbrx

Load Word Byte-Reverse Indexed

lwbrx rD,rA,rB

[POWER mnemonic: lbrx]

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 534, bit 31 reserved.

if rA=0 then b <- 0
else b <- rA
EA <- b+rB
rD <- MEM(EA+3, 1) || MEM(EA+2, 1) || MEM(EA+1, 1) || MEM(EA, 1)

EA is the sum (rA|0)+(rB). Bits 0-7 of the word in memory addressed by EA are loaded
into rD[24-31]. Bits 8-15 of the word in memory addressed by EA are loaded into
rD[16-23]. Bits 16-23 of the word in memory addressed by EA are loaded into rD[8-15].
Bits 24-31 of the word in memory addressed by EA are loaded into rD[0-7].

The PowerPC architecture cautions programmers that some implementations of the
architecture may run the lwbrx instructions with greater latency than other types of load
instructions. This is not the case in the MPC601. This instruction operates with the same
latency as other load instructions.

Other registers altered: 
• None 
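
Reversing the four bytes of a word on load is equivalent to reading the word at EA as a little-endian integer. The sketch below assumes a list-based register file and a bytes object for memory; lwbrx is a hypothetical Python function used only for illustration.

```python
def lwbrx(gpr, mem, rd, ra, rb):
    """rD <- MEM(EA+3,1) || MEM(EA+2,1) || MEM(EA+1,1) || MEM(EA,1)."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + gpr[rb]) & 0xFFFFFFFF
    # A little-endian read places the byte at EA in the low-order position,
    # which matches the concatenation order in the RTL above.
    gpr[rd] = int.from_bytes(mem[ea:ea + 4], "little")
```

With the bytes 0x12, 0x34, 0x56, 0x78 at EA, the register receives 0x78563412.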
lwz

Load Word and Zero

lwz rD,d(rA)

[POWER mnemonic: l]

Integer Unit

Instruction format: bits 0-5 = 32, bits 6-10 = D, bits 11-15 = A, bits 16-31 = d.

if rA=0 then b <- 0
else b <- rA

EA <- b+EXTS(d)
rD <- MEM(EA, 4)

EA is the sum (rA|0) + d. The word in memory addressed by EA is loaded into rD.

Other registers altered: 
• None 
lwzu

Load Word and Zero with Update

lwzu rD,d(rA)

[POWER mnemonic: lu]

Integer Unit

Instruction format: bits 0-5 = 33, bits 6-10 = D, bits 11-15 = A, bits 16-31 = d.

if rA=0 then b <- 0
else b <- (rA)

EA <- b+EXTS(d)
rD <- MEM(EA, 4)
rA <- EA

EA is the sum (rA|0) + d. The word in memory addressed by EA is loaded into rD.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into rD and the register update is suppressed. The PowerPC architecture defines
load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 
• None 
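
The MPC601 handling of the two invalid update forms (rA=0 and rA=rD) described above can be sketched as follows. This is an illustrative model only; exts16 and lwzu_601 are hypothetical names, and other PowerPC processors may behave differently for these forms.

```python
def exts16(d):
    """Sign-extend a 16-bit displacement (EXTS in the RTL)."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def lwzu_601(gpr, mem, rd, ra, d):
    """Load word with update, following the MPC601 behavior for invalid forms."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + exts16(d)) & 0xFFFFFFFF
    data = int.from_bytes(mem[ea:ea + 4], "big")
    # The update of rA is suppressed when rA=0 (r0 is left alone) or rA=rD.
    if ra != 0 and ra != rd:
        gpr[ra] = ea
    gpr[rd] = data
```

In the normal case both rD and rA change; with rA=rD only the load data is written, and with rA=0 register r0 keeps its previous value.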
lwzux

Load Word and Zero with Update Indexed

lwzux rD,rA,rB

[POWER mnemonic: lux]

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 55, bit 31 reserved.

if rA=0 then b <- 0
else b <- (rA)

EA <- b+(rB)
rD <- MEM(EA, 4)
rA <- EA

EA is the sum (rA|0)+(rB). The word in memory addressed by EA is loaded into rD.

EA is placed into rA.

If operand rA=0, the MPC601 does not update register r0; if rA=rD, the load data is
loaded into register rD and the register update is suppressed. The PowerPC architecture
defines load with update instructions with operand rA=0 or rA=rD as invalid forms.

Other registers altered: 

• None 
lwzx

Load Word and Zero Indexed

lwzx rD,rA,rB

[POWER mnemonic: lx]

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 23, bit 31 reserved.

if rA=0 then b <- 0
else b <- rA

EA <- b+rB
rD <- MEM(EA, 4)

EA is the sum (rA|0) + (rB). The word in memory addressed by EA is loaded into rD.

Other registers altered: 
• None 
maskgx POWER Architecture Instruction

Mask Generate

maskg rA,rS,rB (Rc=0)
maskg. rA,rS,rB (Rc=1)

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = S, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 29, bit 31 = Rc.

This instruction is not part of the PowerPC architecture.

Let mstart=rS[27-31], specifying the starting point of a mask of ones. Let
mstop=rB[27-31], specifying the end point of the mask of ones.

If mstart < mstop+1 then
    MASK(mstart..mstop)=ones
    MASK(all other bits)=zeros
If mstart = mstop+1 then
    MASK(0-31)=ones
If mstart > mstop+1 then
    MASK(mstop+1..mstart-1)=zeros
    MASK(all other bits)=ones

MASK is then placed in rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
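
The three mask-generation cases can be sketched in Python. The function name maskg and the helper bit are hypothetical; the sketch uses the manual's big-endian bit numbering, in which bit 0 is the most significant bit of the 32-bit register and rS[27-31] is therefore the low-order five bits of rS.

```python
def maskg(rs, rb):
    """Return the 32-bit mask generated from mstart=rS[27-31] and mstop=rB[27-31]."""
    mstart = rs & 0x1F
    mstop = rb & 0x1F

    def bit(i):
        # Big-endian numbering: bit 0 is the most significant bit.
        return 1 << (31 - i)

    if mstart < mstop + 1:
        # Ones from mstart through mstop, zeros elsewhere.
        return sum(bit(i) for i in range(mstart, mstop + 1))
    if mstart == mstop + 1:
        return 0xFFFFFFFF
    # mstart > mstop+1: zeros from mstop+1 through mstart-1, ones elsewhere.
    zeros = sum(bit(i) for i in range(mstop + 1, mstart))
    return 0xFFFFFFFF ^ zeros
```

For instance, mstart=24 and mstop=31 select the low-order byte, while mstart=28 and mstop=3 produce a wrapped mask covering bits 28-31 and 0-3.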
maskirx POWER Architecture Instruction

Mask Insert from Register

maskir rA,rS,rB (Rc=0)
maskir. rA,rS,rB (Rc=1)

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = S, bits 11-15 = A, bits 16-20 = B, bits 21-30 = 541, bit 31 = Rc.

This instruction is not part of the PowerPC architecture.

Register rS is inserted into rA under control of the mask in rB.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
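
A masked insert of this kind is commonly expressed as a bitwise select. The sketch below assumes that a 1 bit in the rB mask takes the corresponding bit from rS and a 0 bit preserves the bit already in rA; the text above does not spell out that polarity, so treat it as an assumption. The function name maskir is hypothetical.

```python
def maskir(ra, rs, rb):
    """Return the new rA: bits of rS where the mask rB is 1, old rA bits elsewhere."""
    return ((rs & rb) | (ra & ~rb)) & 0xFFFFFFFF
```

With rA=0xAAAAAAAA, rS=0x12345678, and a mask of 0x0000FFFF, only the low-order half word of rA is replaced.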
mcrf

Move Condition Register Field

mcrf crfD,crfS

Integer Unit

Instruction format: bits 0-5 = 19, bits 6-8 = crfD, bits 9-10 reserved, bits 11-13 = crfS, bits 14-15 reserved, bits 16-20 = 00000, bits 21-30 = 0000000000, bit 31 reserved.

CR[4*crfD:4*crfD+3] <- CR[4*crfS:4*crfS+3]

The contents of condition register field crfS are copied into condition register field crfD.
All other condition register fields remain unchanged.

Note that if the link bit (bit 31) is set for this instruction, the PowerPC architecture
considers the instruction to be of an invalid form. On the MPC601, this instruction
executes and the link register is left in an undefined state.

Note: Use of invalid instruction forms is not recommended. This description is provided 
for informational purposes only. 

Other registers altered: 

• Condition Register (CR field specified by operand crfD): 
Affected: LT, GT, EQ, SO 
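
The field copy CR[4*crfD:4*crfD+3] <- CR[4*crfS:4*crfS+3] can be sketched with shifts and masks. The function name mcrf is hypothetical, and the condition register is modeled as a 32-bit integer in which field 0 occupies the four most significant bits.

```python
def mcrf(cr, crfd, crfs):
    """Return CR with 4-bit field crfS copied into field crfD."""
    src_shift = 28 - 4 * crfs  # field 0 is the most significant nibble
    dst_shift = 28 - 4 * crfd
    field = (cr >> src_shift) & 0xF
    cleared = cr & ~(0xF << dst_shift) & 0xFFFFFFFF
    return cleared | (field << dst_shift)
```

Copying field 7 into field 0 changes only the top nibble of the register value.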
mcrfs

Move to Condition Register from FPSCR

mcrfs crfD,crfS

Floating-Point Unit

Instruction format: bits 0-5 = 63, bits 6-8 = crfD, bits 9-10 reserved, bits 11-13 = crfS, bits 14-15 reserved, bits 16-20 = 00000, bits 21-30 = 64, bit 31 reserved.

The contents of FPSCR field crfS are copied to CR Field crfD. All other CR fields are
unchanged. All exception bits copied are reset to zero in the FPSCR.

Other registers altered:

• Condition Register (CR Field specified by operand crfS):

Affected: FX, OX (if crfS=0)

Affected: UX, ZX, XX, VXSNAN (if crfS=1)

Affected: VXISI, VXIDI, VXZDZ, VXIMZ (if crfS=2)

Affected: VXVC (if crfS=3)

Affected: VXSOFT, VXSQRT, VXCVI (if crfS=5)
mcrxr

Move to Condition Register from XER

mcrxr crfD

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-8 = crfD, bits 9-10 = 00, bits 11-15 = 00000, bits 16-20 = 00000, bits 21-30 = 512, bit 31 reserved.

CR[4*crfD:4*crfD+3] <- XER[0-3]
XER[0-3] <- b'0000'

The contents of XER[0-3] are copied into the condition register field designated by crfD.
All other fields of the condition register remain unchanged. XER[0-3] is cleared to zero.

Other registers altered:

• Condition Register (CR Field specified by crfD operand):
Affected: LT, GT, EQ, SO

• XER[0-3]:
mfcr

Move from Condition Register

mfcr rD

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = 00000, bits 16-20 = 00000, bits 21-30 = 19, bit 31 reserved.

rD <- CR

The contents of the condition register are placed into rD. 

Other registers altered: 
• None 
mffsx

Move from FPSCR

mffs frD (Rc=0)
mffs. frD (Rc=1)

Integer Unit

Instruction format: bits 0-5 = 63, bits 6-10 = frD, bits 11-15 = 00000, bits 16-20 = 00000, bits 21-30 = 583, bit 31 = Rc.

The contents of the FPSCR are placed into bits 32-63 of register frD. Bits 0-31 of register
frD are undefined.

Other registers altered:

• Condition Register (CR1 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

POWER Compatibility Note: The PowerPC architecture defines bits 0-31 of
floating-point register frD as undefined. In the MPC601, these bits take on the value
x'FFF8 0000'.
mfmsr

Move from Machine State Register

mfmsr rD

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-15 = 00000, bits 16-20 = 00000, bits 21-30 = 83, bit 31 reserved.

rD <- MSR

The contents of the MSR are placed into rD.

This is a supervisor-level instruction.

Other registers altered: 
• None 
mfspr

Move from Special Purpose Register

mfspr rD,SPR

Integer Unit

Instruction format: bits 0-5 = 31, bits 6-10 = D, bits 11-20 = SPR, bits 21-30 = 339, bit 31 reserved.

n <- SPR[5-9] || SPR[0-4]
rD <- SPR(n)

The SPR field denotes a special purpose register, encoded as shown in Table 10-4. The
contents of the designated special purpose register are placed into rD.

The value of SPR[0] is 1 if and only if reading the register is a supervisor-level operation.
Execution of this instruction specifying a supervisor-level register when MSR[PR]=1 will
result in a supervisor-level instruction type program exception.

If the SPR field contains a value that is not valid for the MPC601, the instruction form is
invalid. For an invalid instruction form in which SPR[0]=1, if MSR[PR]=1 a
supervisor-level instruction type program exception will occur instead of a no-op.

Other registers altered: 
• None 



Table 10-4. SPR Encodings for mfspr

              SPR*
Decimal   SPR[5-9]   SPR[0-4]   Register Name   Access
0         00000      00000      MQ              User
1         00000      00001      XER             User
4         00000      00100      RTCU            User
5         00000      00101      RTCL            User
6         00000      00110      DEC             User
8         00000      01000      LR              User
9         00000      01001      CTR             User
18        00000      10010      DSISR           Supervisor
19        00000      10011      DAR             Supervisor
22        00000      10110      DEC             Supervisor
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Table 10-4. SPR Encodings for mfspr(Continued) 



              SPR*
Decimal   SPR[5-9]   SPR[0-4]   Register Name                Access
25        00000      11001      SDR1                         Supervisor
26        00000      11010      SRR0                         Supervisor
27        00000      11011      SRR1                         Supervisor
272       01000      10000      SPRG0                        Supervisor
273       01000      10001      SPRG1                        Supervisor
274       01000      10010      SPRG2                        Supervisor
275       01000      10011      SPRG3                        Supervisor
282       01000      11010      EAR                          Supervisor
287       01000      11111      PVR                          Supervisor
528       10000      10000      BAT0U                        Supervisor
529       10000      10001      BAT0L                        Supervisor
530       10000      10010      BAT1U                        Supervisor
531       10000      10011      BAT1L                        Supervisor
532       10000      10100      BAT2U                        Supervisor
533       10000      10101      BAT2L                        Supervisor
534       10000      10110      BAT3U                        Supervisor
535       10000      10111      BAT3L                        Supervisor
1008      11111      10000      Checkstop Register (HID0)    Supervisor
1009      11111      10001      Debug Mode Register (HID1)   Supervisor
1010      11111      10010      IABR (HID2)                  Supervisor
1013      11111      10101      DABR (HID5)                  Supervisor
1023      11111      11111      PIR (HID15)                  Supervisor



*Note that the order of the two 5-bit halves of the SPR number is reversed compared with actual
instruction coding.

If the SPR field contains any value other than one of these implementation-specific values or one 
of the values shown in Table 3-40, the instruction form is invalid. 

spr[0]=1 if and only if writing the register is supervisor-level. Execution of this instruction specifying 
a defined and supervisor-level register when MSR[PR]=1 results in a privilege violation type pro- 
gram exception. 
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For mtspr and mfspr instructions, the SPR number coded in assembly language does not appear 
directly as a 10-bit binary number in the instruction. The number coded is split into two 5-bit halves 
that are reversed in the instruction, with the high-order 5 bits appearing in bits 16-20 of the instruc- 
tion and the low-order 5 bits in bits 11-15. 
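
The half-swapping described above can be checked with a short sketch. The function names are hypothetical; the primary opcode 31 and extended opcode 339 are the values shown in the mfspr instruction format, and the SPR numbers used (8 for LR, 272 for SPRG0) come from Table 10-4.

```python
def spr_field(spr):
    """Return the 10-bit value stored in instruction bits 11-20 for an SPR number."""
    if not 0 <= spr < 1024:
        raise ValueError("SPR number must fit in 10 bits")
    high, low = spr >> 5, spr & 0x1F
    # The low-order half lands in bits 11-15, the high-order half in bits 16-20.
    return (low << 5) | high

def encode_mfspr(rD, spr):
    """Assemble mfspr rD,SPR (opcode 31, extended opcode 339, bit 31 = 0)."""
    return (31 << 26) | (rD << 21) | (spr_field(spr) << 11) | (339 << 1)

print(hex(encode_mfspr(3, 8)))     # mfspr r3,LR
print(hex(encode_mfspr(3, 272)))   # mfspr r3,SPRG0
```

An assembler performs this swap automatically, so it only matters when instruction words are built or decoded by hand.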

SPR encodings for the DEC, MQ, RTCL, and RTCU registers are not part of the PowerPC
architecture.
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mfsr 

Move from Segment Register 



mfsr    rD,SR

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11 = reserved | 12-15 = SR | 16-20 = 00000 (reserved) | 21-30 = 595 | 31 = reserved

rD <- SEGREG(SR)

The contents of segment register SR are placed into rD.

This is a supervisor-level instruction. 

This instruction is defined only for 32-bit implementations; using it on a 64-bit 
implementation causes an illegal instruction type program exception. 

Other registers altered: 
• None 
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mfsrin 

Move from Segment Register Indirect 



mfsrin  rD,rB

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = 00000 (reserved) | 16-20 = B | 21-30 = 659 | 31 = reserved

rD <- SEGREG(rB[0-3])

The contents of the segment register selected by bits 0-3 of rB are copied into rD.

This is a supervisor-level instruction. 

This instruction is defined only for 32-bit implementations. Using it on a 64-bit 
implementation causes an illegal instruction exception. 

Other registers altered: 
• None 
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mtcrf 

Move to Condition Register Fields 



mtcrf   CRM,rS

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11 = reserved | 12-19 = CRM | 20 = reserved | 21-30 = 144 | 31 = reserved

mask <- (4)(CRM[0]) || (4)(CRM[1]) || ... (4)(CRM[7])
CR <- (rS[32-63] & mask) | (CR & ¬mask)

The contents of rS are placed into the condition register under control of the field mask
specified by CRM. The field mask identifies the 4-bit fields affected. Let i be an integer in
the range 0-7. If CRM(i) = 1, CR Field i (CR bits 4*i through 4*i+3) is set to the contents
of the corresponding field of rS.
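
As a rough illustration of the field mask, the sketch below expands each CRM bit into four mask bits and merges the source word into the CR. The function name is an assumption, and rS is taken to be the 32-bit source word.

```python
MASK32 = 0xFFFFFFFF

def mtcrf(cr, rS, crm):
    """Model mtcrf: copy the CR fields selected by the 8-bit CRM value from rS."""
    mask = 0
    for i in range(8):
        if crm & (0x80 >> i):              # CRM[i], with CRM[0] the high-order bit
            mask |= 0xF << (28 - 4 * i)    # CR bits 4*i through 4*i+3
    return (rS & mask) | (cr & ~mask & MASK32)

print(hex(mtcrf(0x12345678, 0xABCDEF01, 0x81)))
```

With CRM = 0x81 only CR fields 0 and 7 are replaced; CRM = 0xFF copies the whole word.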

Other registers altered: 
CR fields selected by mask 
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mtfsbOx 




Move to FPSCR Bit 0

mtfsb0   crbD   (Rc=0)
mtfsb0.  crbD   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 63 | 6-10 = crbD | 11-15 = 00000 (reserved) | 16-20 = 00000 (reserved) | 21-30 = 70 | 31 = Rc

Bit crbD of the FPSCR is cleared to zero. All other bits of the FPSCR are unchanged. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Floating-point Status and Control Register: 
Affected: FPSCR bit crbD 

Note: Bits 1 and 2 (FEX and VX) cannot be explicitly reset. 
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mtfsblx 




Move to FPSCR Bit 1 




mtfsb1   crbD   (Rc=0)
mtfsb1.  crbD   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 63 | 6-10 = crbD | 11-15 = 00000 (reserved) | 16-20 = 00000 (reserved) | 21-30 = 38 | 31 = Rc

Bit crbD of the FPSCR is set to one. All other bits of the FPSCR are unchanged. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Floating-point Status and Control Register:
Affected: FPSCR bit crbD

Note: Bits 1 and 2 (FEX and VX) cannot be explicitly set.
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mtfsfx 




Move to FPSCR Fields 




mtfsf    FM,frB   (Rc=0)
mtfsf.   FM,frB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 63 | 6 = reserved | 7-14 = FM | 15 = reserved | 16-20 = frB | 21-30 = 711 | 31 = Rc



Bits 32-63 of register frB are placed into the FPSCR under control of the field mask
specified by FM. The field mask identifies the 4-bit fields affected. Let i be an integer in the
range 0-7. If FM(i) = 1, FPSCR Field i (FPSCR bits 4*i through 4*i+3) is set to the contents
of the corresponding field of the low-order 32 bits of register frB.

In other PowerPC implementations, the Move to FPSCR Fields (mtfsf) instruction may
perform more slowly when only a portion of the fields are updated.

Other registers altered:

• Condition Register (CR1 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Floating-point Status and Control Register: 

FPSCR fields selected by mask 

Updating fewer than all eight fields of the FPSCR may have substantially poorer 
performance on some implementations than updating all the fields. 

When FPSCR[0-3] is specified, bits 0 (FX) and 3 (OX) are set to the values of frB[32] and
frB[35] (i.e., even if this instruction causes OX to change from 0 to 1, FX is set from
frB[32] and not by the usual rule that FX is set to 1 when an exception bit changes from
0 to 1). Bits 1 and 2 (FEX and VX) are set according to the usual rule and not from
frB[33-34].
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mtfsfix 

Move to FPSCR Field Immediate 



mtfsfi   crfD,IMM   (Rc=0)
mtfsfi.  crfD,IMM   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 63 | 6-8 = crfD | 9-10 = reserved | 11-15 = 00000 (reserved) | 16-19 = IMM | 20 = reserved | 21-30 = 134 | 31 = Rc



The value of the IMM field is placed into FPSCR field crfD. All other FPSCR fields are
unchanged.

Other registers altered:

• Condition Register (CR1 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Floating-point Status and Control Register:

FPSCR field crfD

When FPSCR[0-3] is specified, bits 0 (FX) and 3 (OX) are set to the values of IMM[0] and
IMM[3] (i.e., even if this instruction causes OX to change from 0 to 1, FX is set from
IMM[0] and not by the usual rule that FX is set to 1 when an exception bit changes from
0 to 1). Bits 1 and 2 (FEX and VX) are set according to the usual rule, given in Section 2.2.3,
"Floating-Point Status and Control Register (FPSCR)" and not from IMM[1-2].
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mtmsr 

Move to Machine State Register 



mtmsr   rS

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = 00000 (reserved) | 16-20 = 00000 (reserved) | 21-30 = 146 | 31 = reserved

MSR <- rS[0-31]

The contents of rS are placed into the MSR. 

This is a supervisor-level instruction and context synchronizing. See Section 3.1.2, 
"Context Synchronization" for the definition of context synchronization. 

Other registers altered: 
MSR 
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mtspr 

Move to Special Purpose Register 



mtspr   SPR,rS

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-20 = SPR | 21-30 = 467 | 31 = reserved

n <- SPR[5-9] || SPR[0-4]
SPREG(n) <- rS[0-31]

The SPR field denotes a special purpose register, encoded as shown in Table 10-5. The
contents of rS are placed into the designated special purpose register.

The value of SPR[0] is 1 if and only if writing the register is a supervisor-level operation.
Execution of this instruction specifying a defined and supervisor-level register when
MSR[PR]=1 results in a supervisor-level instruction exception.

If the SPR field contains an invalid value, the instruction form is invalid. For an invalid
instruction form in which SPR[0]=1, if MSR[PR]=1 a supervisor-level instruction
exception will occur instead of a no-op.

Other registers altered:
• None

Table 10-5 lists the SPR encodings for the MPC601.



Table 10-5. SPR Encodings for mtspr

              SPR*
Decimal   SPR[5-9]   SPR[0-4]   Register Name   Access
0         00000      00000      MQ              User
1         00000      00001      XER             User
4         00000      00100      RTCU            User
5         00000      00101      RTCL            User
6         00000      00110      DEC             User
8         00000      01000      LR              User
9         00000      01001      CTR             User
18        00000      10010      DSISR           Supervisor
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Table 10-5. SPR Encodings for mtspr(Continued) 



              SPR*
Decimal   SPR[5-9]   SPR[0-4]   Register Name                Access
19        00000      10011      DAR                          Supervisor
22        00000      10110      DEC                          Supervisor
25        00000      11001      SDR1                         Supervisor
26        00000      11010      SRR0                         Supervisor
27        00000      11011      SRR1                         Supervisor
272       01000      10000      SPRG0                        Supervisor
273       01000      10001      SPRG1                        Supervisor
274       01000      10010      SPRG2                        Supervisor
275       01000      10011      SPRG3                        Supervisor
282       01000      11010      EAR                          Supervisor
528       10000      10000      BAT0U                        Supervisor
529       10000      10001      BAT0L                        Supervisor
530       10000      10010      BAT1U                        Supervisor
531       10000      10011      BAT1L                        Supervisor
532       10000      10100      BAT2U                        Supervisor
533       10000      10101      BAT2L                        Supervisor
534       10000      10110      BAT3U                        Supervisor
535       10000      10111      BAT3L                        Supervisor
1008      11111      10000      Checkstop Register (HID0)    Supervisor
1009      11111      10001      Debug Mode Register (HID1)   Supervisor
1010      11111      10010      IABR (HID2)                  Supervisor
1013      11111      10101      DABR (HID5)                  Supervisor
1023      11111      11111      PIR (HID15)                  Supervisor
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mtsr 

Move to Segment Register 



mtsr    SR,rS

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11 = reserved | 12-15 = SR | 16-20 = 00000 (reserved) | 21-30 = 210 | 31 = reserved

SEGREG(SR) <- (rS)

The contents of rS are placed into segment register SR.

This is a supervisor-level instruction.

This instruction is defined only for 32-bit implementations. Using it on a 64-bit
implementation causes an illegal instruction type program exception.

Other registers altered: 
• None 
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mtsrin 

Move to Segment Register Indirect 

mtsrin rS,rB 

[POWER mnemonic: mtsri]

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = 00000 (reserved) | 16-20 = B | 21-30 = 242 | 31 = reserved

SEGREG(rB[0-3]) <- (rS)

The contents of rS are copied to the segment register selected by bits 0-3 of rB.

This is a supervisor-level instruction. 

This instruction is defined only for 32-bit implementations. Using it on a 64-bit 
implementation causes an illegal instruction exception. 

Other registers altered: 
• None 
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mulx 


POWER Architecture 


Multiply 






mul     rD,rA,rB   (OE=0 Rc=0)
mul.    rD,rA,rB   (OE=0 Rc=1)
mulo    rD,rA,rB   (OE=1 Rc=0)
mulo.   rD,rA,rB   (OE=1 Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = A | 16-20 = B | 21 = OE | 22-30 = 107 | 31 = Rc

This instruction is not part of the PowerPC architecture.

Bits 0-31 of the product (rA)*(rB) are placed into rD. Bits 32-63 of the product (rA)*(rB)
are placed into the MQ register.

If Rc=1, then LT, GT and EQ reflect the result in the MQ register (the low-order 32 bits). If
OE=1 then SO and OV are set to one if the product cannot be represented in 32 bits.
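
The split of the 64-bit product between rD and MQ can be sketched as follows. The helper names are illustrative, and the sketch assumes the operands are treated as signed 32-bit integers, which is consistent with the overflow rule above but is not spelled out in this description.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mul(rA, rB):
    """Model mul: return (rD, MQ, overflow) for the 64-bit product of rA and rB."""
    product = to_signed32(rA) * to_signed32(rB)
    product64 = product & 0xFFFFFFFFFFFFFFFF
    rD = product64 >> 32                   # bits 0-31 of the product
    mq = product64 & MASK32                # bits 32-63 of the product
    overflow = not (-(1 << 31) <= product < (1 << 31))
    return rD, mq, overflow

print(mul(0x00010000, 0x00010000))
```

When the overflow flag is false, MQ alone holds the complete signed result and rD is just its sign extension.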

If the smaller absolute value of the two multipliers is placed in rB, the instruction may 
complete execution more quickly. See 7.3.2.1, "Integer Instructions Timing Examples" for 
additional information about instruction performance. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: SO, OV (if OE=1)

Note: This instruction is specific to the MPC601.
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mulhwx 

Multiply High Word 



mulhw    rD,rA,rB   (Rc=0)
mulhw.   rD,rA,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = A | 16-20 = B | 21 = reserved | 22-30 = 75 | 31 = Rc

prod[0-63] <- rA[32-63]*rB[32-63]

rD[32-63] <- prod[0-31]

rD[0-31] <- undefined

The contents of rA and of rB are interpreted as 32-bit signed integers. They are multiplied
to form a 64-bit signed integer product. The high-order 32 bits of the 64-bit product are
placed into rD.

If the smaller absolute value of the two multipliers is placed in rB, the instruction may
complete execution more quickly. See 7.3.2.1, "Integer Instructions Timing Examples" for
additional information about instruction performance.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
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mulhwux 

Multiply High Word Unsigned 



mulhwu   rD,rA,rB   (Rc=0)
mulhwu.  rD,rA,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = A | 16-20 = B | 21 = reserved | 22-30 = 11 | 31 = Rc

prod[0-63] <- rA[32-63]*rB[32-63]

rD[32-63] <- prod[0-31]

rD[0-31] <- undefined

The contents of rA and of rB are extracted and interpreted as 32-bit unsigned integers.
They are multiplied to form a 64-bit unsigned integer product. The high-order 32 bits of the
64-bit product are placed into rD.
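
The difference between the signed form (mulhw) and the unsigned form (mulhwu) shows up only in the high word. The following sketch, with illustrative function names, computes both for the same operands.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mulhw(rA, rB):
    """High-order 32 bits of the signed 64-bit product."""
    return ((to_signed32(rA) * to_signed32(rB)) >> 32) & MASK32

def mulhwu(rA, rB):
    """High-order 32 bits of the unsigned 64-bit product."""
    return ((rA & MASK32) * (rB & MASK32)) >> 32

print(hex(mulhw(0xFFFFFFFF, 2)), hex(mulhwu(0xFFFFFFFF, 2)))
```

Here x'FFFFFFFF' is -1 when read as signed, so the signed high word is all ones, while the unsigned high word is 1.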

If the smaller absolute value of the two multipliers is placed in rB, the instruction may 
complete execution more quickly. See 7.3.2.1, "Integer Instructions Timing Examples" for 
additional information about instruction performance. 

This instruction causes the contents of the MQ to become undefined. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
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mullwx 






Multiply Low 






mullw    rD,rA,rB   (OE=0 Rc=0)
mullw.   rD,rA,rB   (OE=0 Rc=1)
mullwo   rD,rA,rB   (OE=1 Rc=0)
mullwo.  rD,rA,rB   (OE=1 Rc=1)

[POWER mnemonics: muls, muls., mulso, mulso.]

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = A | 16-20 = B | 21 = OE | 22-30 = 235 | 31 = Rc

rD <- rA[32-63]*rB[32-63]

The low-order 32 bits of the 64-bit product (rA)*(rB) are placed into rD. The low-order
32 bits of the product are independent of whether the operands are treated as signed or
unsigned integers. However, OV is set based on the result interpreted as a signed integer.
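
A small sketch can confirm that the low word does not depend on signedness while the overflow indication does. The function name and the returned tuple are assumptions made for the illustration.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mullw(rA, rB):
    """Model mullw: return (rD, overflow) using the signed interpretation."""
    signed_product = to_signed32(rA) * to_signed32(rB)
    rD = signed_product & MASK32
    overflow = not (-(1 << 31) <= signed_product < (1 << 31))
    return rD, overflow

a, b = 0xFFFFFFFE, 3
unsigned_low = ((a & MASK32) * (b & MASK32)) & MASK32
print(hex(mullw(a, b)[0]), hex(unsigned_low))
```

Both computations yield the same low word; only the overflow test relies on reading the operands as signed.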

If the smaller absolute value of the two multipliers is placed in rB, the instruction may 
complete execution more quickly. See 7.3.2.1, "Integer Instructions Timing Examples" for 
additional information about instruction performance. 

If OE=1, then OV is set to one if the product cannot be represented in 32 bits.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: SO, OV (if OE=1)
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mulli 

Multiply Low Immediate 

mulli   rD,rA,SIMM

[POWER mnemonic: muli]

Integer Unit

Instruction format (bits): 0-5 = 07 | 6-10 = D | 11-15 = A | 16-31 = SIMM

prod[0-48] <- rA*SIMM
rD <- prod[16-48]

The low-order 32 bits of the 48-bit product (rA)*SIMM are placed into rD. The low-order
32 bits of the product are independent of whether the operands are treated as signed or
unsigned integers.

Other registers altered: 
• None 
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nabsx 


POWER Architecture 


Negative Absolute 






nabs    rD,rA   (OE=0 Rc=0)
nabs.   rD,rA   (OE=0 Rc=1)
nabso   rD,rA   (OE=1 Rc=0)
nabso.  rD,rA   (OE=1 Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = A | 16-20 = 00000 (reserved) | 21 = OE | 22-30 = 488 | 31 = Rc

This instruction is not part of the PowerPC architecture.

The negative absolute value -|(rA)| is placed into rD.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: SO, OV (if OE=1)

Note that nabs never overflows. If OE=1 then XER(OV) is cleared to zero and XER(SO)
is not changed.

Note: This instruction is specific to the MPC601.
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nandx 






NAND 






nand    rA,rS,rB   (Rc=0)
nand.   rA,rS,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = A | 16-20 = B | 21-30 = 476 | 31 = Rc

rA <- ¬((rS) & (rB))

The contents of rS are ANDed with the contents of rB and the one's complement of the
result is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

NAND with rS=rB can be used to obtain the one's complement.
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negx 

Negate 



neg     rD,rA   (OE=0 Rc=0)
neg.    rD,rA   (OE=0 Rc=1)
nego    rD,rA   (OE=1 Rc=0)
nego.   rD,rA   (OE=1 Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = D | 11-15 = A | 16-20 = 00000 (reserved) | 21 = OE | 22-30 = 104 | 31 = Rc

rD <- ¬(rA) + 1

The sum ¬(rA) + 1 is placed into rD.

If rA contains the most negative 32-bit number (x'8000_0000'), the low-order 32 bits of
the result contain the most negative 32-bit number and, if OE=1, OV is set.
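
The special case is easy to see in a small model of the operation. The function name and the returned overflow flag are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF

def neg(rA):
    """Model neg: return (rD, overflow) where rD is the complement of rA plus one."""
    rD = (~rA + 1) & MASK32
    overflow = (rA & MASK32) == 0x80000000   # the one value that has no negation
    return rD, overflow

print(neg(1), neg(0x80000000))
```

Negating x'8000_0000' returns the same bit pattern because +2^31 cannot be represented in 32 bits, which is why this is the only input that sets OV.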

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: SO, OV (if OE=1)
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norx 

NOR 

nor     rA,rS,rB   (Rc=0)
nor.    rA,rS,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = A | 16-20 = B | 21-30 = 124 | 31 = Rc

rA <- ¬((rS) | (rB))

The contents of rS are ORed with the contents of rB and the one's complement of the result
is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
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orx 






OR 






or      rA,rS,rB   (Rc=0)
or.     rA,rS,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = A | 16-20 = B | 21-30 = 444 | 31 = Rc

rA <- (rS) | (rB)

The contents of rS are ORed with the contents of rB and the result is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
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orcx 

OR with Complement 



orc     rA,rS,rB   (Rc=0)
orc.    rA,rS,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = A | 16-20 = B | 21-30 = 412 | 31 = Rc

rA <- (rS) | ¬(rB)

The contents of rS are ORed with the complement of the contents of rB and the result is
placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
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on 

OR Immediate 

ori     rA,rS,UIMM

[POWER mnemonic: oril]

Integer Unit

Instruction format (bits): 0-5 = 24 | 6-10 = S | 11-15 = A | 16-31 = UIMM

rA <- (rS) | ((16)0 || UIMM)

The contents of rS are ORed with x'0000' || UIMM and the result is placed into rA.

The preferred "no-op" (an instruction that does nothing) is: 

ori 0,0,0 

Other registers altered: 
• None 
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oris 

OR Immediate Shifted 

oris    rA,rS,UIMM

[POWER mnemonic: oriu]

Integer Unit

Instruction format (bits): 0-5 = 25 | 6-10 = S | 11-15 = A | 16-31 = UIMM

rA <- (rS) | (UIMM || (16)0)

The contents of rS are ORed with UIMM || x'0000' and the result is placed into rA.

Other registers altered: 
• None 
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rfi 

Return from Interrupt 



rfi

Integer Unit

Instruction format (bits): 0-5 = 19 | 6-10 = 00000 (reserved) | 11-15 = 00000 (reserved) | 16-20 = 00000 (reserved) | 21-30 = 50 | 31 = reserved

MSR[16-31] <- SRR1[16-31]
NIA <- SRR0[0-29] || b'00'

Bits 16-31 of SRR1 are placed into bits 16-31 of the MSR, then the next instruction is
fetched, under control of the new MSR value, from the address SRR0[0-29] || b'00'.

This is a supervisor-level instruction and is context synchronizing.

Other registers altered: 
MSR 
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rimix POWER Architecture Instruction 

Rotate Left then Mask Insert 



rlmi    rA,rS,rB,MB,ME   (Rc=0)
rlmi.   rA,rS,rB,MB,ME   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 22 | 6-10 = S | 11-15 = A | 16-20 = B | 21-25 = MB | 26-30 = ME | 31 = Rc

This instruction is not part of the PowerPC architecture.

The contents of rS are rotated left the number of positions specified by bits 27-31 of rB. The
rotated data is inserted into rA under control of the generated mask.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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rlwimix 

Rotate Left Word Immediate then Mask Insert 

rlwimi   rA,rS,SH,MB,ME   (Rc=0)
rlwimi.  rA,rS,SH,MB,ME   (Rc=1)

[POWER mnemonics: rlimi, rlimi.]

Integer Unit

Instruction format (bits): 0-5 = 20 | 6-10 = S | 11-15 = A | 16-20 = SH | 21-25 = MB | 26-30 = ME | 31 = Rc

n <- SH
r <- ROTL(rS, n)
m <- MASK(MB, ME)
rA <- (r & m) | (rA & ¬m)

The contents of rS are rotated left SH bits. A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated data is inserted into rA under control of
the generated mask.
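
The rotate, mask, and insert steps can be sketched as below. The helper names are illustrative. The wrap-around case of the mask (MB greater than ME) is not covered in the description above; the sketch assumes the mask then runs from bit MB through bit 31 and continues from bit 0 through bit ME.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the low 32 bits of x left by n positions."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """1-bits from bit mb through bit me, with bit 0 the most significant bit."""
    from_mb = MASK32 >> mb                     # ones in bits mb..31
    to_me = (MASK32 << (31 - me)) & MASK32     # ones in bits 0..me
    return (from_mb & to_me) if mb <= me else (from_mb | to_me)

def rlwimi(rA, rS, sh, mb, me):
    """Insert the rotated rS into rA wherever the mask has 1-bits."""
    m = mask(mb, me)
    return (rotl32(rS, sh) & m) | (rA & ~m & MASK32)

print(hex(rlwimi(0xFFFFFFFF, 0x0000000A, 8, 20, 23)))
```

In this call the low nibble of rS is rotated into bits 20-23 and replaces only those four bits of rA.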



Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
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rlwinmx 

Rotate Left Word Immediate then AND with Mask 

rlwinm   rA,rS,SH,MB,ME   (Rc=0)
rlwinm.  rA,rS,SH,MB,ME   (Rc=1)

[POWER mnemonics: rlinm, rlinm.]

Integer Unit

Instruction format (bits): 0-5 = 21 | 6-10 = S | 11-15 = A | 16-20 = SH | 21-25 = MB | 26-30 = ME | 31 = Rc

n <- SH
r <- ROTL(rS, n)
m <- MASK(MB, ME)
rA <- r & m

The contents of rS are rotated left SH bits. A mask is generated having 1-bits from bit MB
through bit ME and 0-bits elsewhere. The rotated data is ANDed with the generated mask
and the result is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

The opcode rlwinm can be used to extract an n-bit field, that starts at bit position b in
rS[0-31], right-justified into rA (clearing the remaining 32-n bits of rA), by setting SH=b+n,
MB=32-n, and ME=31. It can be used to extract an n-bit field, that starts at bit position b in
rS[0-31], left-justified into rA (clearing the remaining 32-n bits of rA), by setting SH=b,
MB=0, and ME=n-1. It can be used to rotate the contents of a register left (or right) by n
bits, by setting SH=n (32-n), MB=0, and ME=31. It can be used to shift the contents of a
register right by n bits, by setting SH=32-n, MB=n, and ME=31. It can be used to clear the
high-order b bits of a register and then shift the result left by n bits by setting SH=n,
MB=b-n and ME=31-n. It can be used to clear the low-order n bits of a register, by setting
SH=0, MB=0, and ME=31-n.
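
Several of the uses listed above can be tried out with a small model of the instruction. The helper names are illustrative, and the handling of MB greater than ME in the mask helper is an assumption that none of these examples rely on.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the low 32 bits of x left by n positions."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def mask(mb, me):
    """1-bits from bit mb through bit me, with bit 0 the most significant bit."""
    from_mb = MASK32 >> mb
    to_me = (MASK32 << (31 - me)) & MASK32
    return (from_mb & to_me) if mb <= me else (from_mb | to_me)

def rlwinm(rS, sh, mb, me):
    """Rotate rS left by sh bits, then AND with MASK(mb, me)."""
    return rotl32(rS, sh) & mask(mb, me)

value = 0x12345678
b, n = 8, 8                                        # field in bits 8-15 holds x'34'
print(hex(rlwinm(value, b + n, 32 - n, 31)))       # right-justified extract
print(hex(rlwinm(value, b, 0, n - 1)))             # left-justified extract
print(hex(rlwinm(value, 32 - 4, 4, 31)))           # shift right by 4 bits
print(hex(rlwinm(value, 0, 0, 31 - 8)))            # clear the low-order 8 bits
```

Each call uses the SH, MB, and ME settings given in the paragraph above for the corresponding operation.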
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rlwnmx 

Rotate Left Word then AND with Mask 

rlwnm    rA,rS,rB,MB,ME   (Rc=0)
rlwnm.   rA,rS,rB,MB,ME   (Rc=1)

[POWER mnemonics: rlnm, rlnm.]

Integer Unit

Instruction format (bits): 0-5 = 23 | 6-10 = S | 11-15 = A | 16-20 = B | 21-25 = MB | 26-30 = ME | 31 = Rc

n <- rB[27-31]
r <- ROTL(rS, n)
m <- MASK(MB, ME)
rA <- r & m

The contents of rS are rotated left the number of bits specified by rB[27-31]. A mask is
generated having 1-bits from bit MB through bit ME and 0-bits elsewhere. The rotated data
is ANDed with the generated mask and the result is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

The opcode rlwnm can be used to extract an n-bit field, that starts at variable bit position
b in rS[0-31], right-justified into rA (clearing the remaining 32-n bits of rA), by setting
rB[27-31]=b+n, MB=32-n, and ME=31. It can be used to extract an n-bit field, that starts
at variable bit position b in rS[0-31], left-justified into rA (clearing the remaining 32-n bits
of rA), by setting rB[27-31]=b, MB=0, and ME=n-1. It can be used to rotate the contents
of a register left (or right) by variable n bits, by setting rB[27-31]=n (32-n), MB=0, and
ME=31.

Equivalent mnemonics are provided for some of these uses. 
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rri bx power Architecture Instruction 

Rotate Right and Insert Bit 



rrib    rA,rS,rB   (Rc=0)
rrib.   rA,rS,rB   (Rc=1)

Integer Unit

Instruction format (bits): 0-5 = 31 | 6-10 = S | 11-15 = A | 16-20 = B | 21-30 = 537 | 31 = Rc

This instruction is not part of the PowerPC architecture.

Bit 0 of rS is rotated right the amount specified by bits 27-31 of rB. The bit is then inserted
into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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sc sc 

System Call Integer Unit 



[POWER mnemonic: svca]

Instruction format (bits): 0-5 = 17 | 6-10 = 00000 (reserved) | 11-15 = 00000 (reserved) | 16-29 = all zeros (reserved) | 30 = 1 | 31 = reserved

This instruction calls the operating system to perform a service. When control is returned 
to the program that executed the system call, the content of the registers depends on the 
register conventions used by the program providing the system service. 

This instruction is context synchronizing, as described in Section 3.1.2, "Context 
Synchronization". Although the PowerPC architecture considers sc to be a branch 
processor instruction, it is executed by the integer processor in the MPC601.

Other registers altered: 
Dependent on the system service 

POWER Compatibility Note: The PowerPC sc instruction is substantially different from 
the POWER svc instruction. The following aspects of these instructions were considered
with respect to POWER compatibility: 

PowerPC defines the sc instruction with the "LK" bit set to be an invalid form. POWER 
architecture defines the svc instruction (same opcode as PowerPC sc instruction) with the 
"LK" bit set as a valid form which places the address of the instruction following the svc 
into the link register. In the case of the MPC601, an sc instruction with the "LK" bit set will
execute correctly (as defined in PowerPC) and will update the link register with the address 
of the instruction following the sc instruction. 

PowerPC defines the sc instruction in such a manner that requires bit 30 of the instruction 
to be b'1' (when bit 30 is b'0', the instruction is considered reserved). The POWER
architecture svc instruction does not have such a restriction, and uses this bit to define an 
alternate form of the svc instruction. Although the MPC601 does not support this alternate 
form of the svc instruction, it does ignore the state of bit 30 of the instruction during decode 
and execution. 

As a result of executing an sc instruction, PowerPC defines bits 0-15 of register SRR1 to
be undefined. In the case of the MPC601, execution of the sc instruction will cause bits 16-31
of the instruction to be placed into bits 0-15 of register SRR1.

The effective address of the instruction following the system call instruction is placed into
SRR0. Bits 16-31 of the MSR are placed into bits 16-31 of SRR1.
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Then a system call exception is generated. The exception causes the MSR to be altered as 
described in Section 5.4, "Exception Definitions".

The exception causes the next instruction to be fetched from offset x'C00' from the
physical base address indicated by the new setting of MSR[IP]. This instruction is
context-synchronizing.

Other registers altered: 
SRR0, SRR1, MSR
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sle 



POWER Architecture Instruction 



Shift Left Extended 



sle     rA,rS,rB   (Rc=0)
sle.    rA,rS,rB   (Rc=1)

Integer Unit



31 
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B 
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Re 
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10 11 



15 16 



20 21 



30 31 



This instruction is not part of the PowerPC architecture. 

Register rS is rotated left n bits where n is the shift amount specified in bits 27-31 of rB.
The rotated word is placed in the MQ register. A mask of 32-n ones followed by n zeros is
generated. The logical AND of the rotated word and the generated mask is placed in rA.
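
A minimal model of this behavior, with illustrative names, returns both the value written to rA and the value left in the MQ register.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the low 32 bits of x left by n positions."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def sle(rS, rB):
    """Model sle: return (rA, MQ) for a shift amount taken from bits 27-31 of rB."""
    n = rB & 0x1F
    rotated = rotl32(rS, n)
    m = (MASK32 << n) & MASK32             # 32-n ones followed by n zeros
    return rotated & m, rotated

print([hex(v) for v in sle(0x80000001, 4)])
```

The result in rA is an ordinary left shift, while MQ keeps the bits that were rotated out of the high-order end so that a following extended shift can pick them up.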

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601. 
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SleC|x POWER Architecture Instruction 

Shift Left Extended with MQ 



sleq    rA,rS,rB   (Rc=0)
sleq.   rA,rS,rB   (Rc=1)

Integer Unit



31 
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B 
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Re 
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10 11 



15 16 



20 21 



30 31 



This instruction is not part of the PowerPC architecture. 

Register rS is rotated left n bits where n is the shift amount specified in bits 27-31 of rB.
A mask of 32-n ones followed by n zeros is generated. The rotated word is then merged
with the contents of the MQ register, under control of the generated mask. The merged
word is placed in rA. The rotated word is placed in the MQ register.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
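The merge "under control of the generated mask" can be sketched as follows. The sketch assumes the usual POWER reading of a merge, namely that result bits come from the rotated word where the mask is 1 and from the old MQ contents where the mask is 0; the names `rotl32` and `sleq` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (0 <= n <= 31)."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def sleq(rs, rb, mq):
    """Return (rA, new MQ) for sleq (hypothetical model, CR0 omitted)."""
    n = rb & 0x1F
    rotated = rotl32(rs, n)
    mask = (MASK32 << n) & MASK32                   # 32-n ones, then n zeros
    # Assumed merge direction: mask=1 selects the rotated word, mask=0 selects MQ.
    ra = (rotated & mask) | (mq & ~mask & MASK32)
    return ra, rotated

ra, mq = sleq(0x12345678, 8, 0x000000AB)
print(hex(ra), hex(mq))
```

Under that assumption, the low n bits of rA are supplied by the previous MQ value, which is how a multiword left shift picks up the bits shifted out of the neighboring word.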
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SliC|x POWER Architecture Instruction 

Shift Left Immediate with MQ 



sliq    rA,rS,SH    (Rc=0)
sliq.   rA,rS,SH    (Rc=1)



sliqx 

Integer Unit 



31 
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A 


SH 
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Re 
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This instruction is not part of the PowerPC architecture. 

Register rS is rotated left n bits where n is the shift amount specified by SH. The rotated
word is placed in the MQ register. A mask of 32-n ones followed by n zeros is generated.
The logical AND of the rotated word and the generated mask is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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SHiC|x POWER Architecture Instruction 

Shift Left Long Immediate with MQ 



slliq   rA,rS,SH    (Rc=0)
slliq.  rA,rS,SH    (Rc=1)



slliqx 

Integer Unit 



31 


S 


A 


SH 
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Re 
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15 16 



20 21 



30 31 



This instruction is not part of the PowerPC architecture.

Register rS is rotated left n bits where n is the shift amount specified by SH. A mask of
32-n ones followed by n zeros is generated. The rotated word is then merged with the
contents of the MQ register, under control of the generated mask. The merged word is
placed into rA. The rotated word is placed into the MQ register.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.



MOTOROLA 



Chapter 10. Instruction Set 



10-159 



sllqx POWER Architecture Instruction

Shift Left Long with MQ

Integer Unit

sllq    rA,rS,rB    (Rc=0)
sllq.   rA,rS,rB    (Rc=1)

Fields:  31 | S | A | B | 216 | Rc
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

This instruction is not part of the PowerPC architecture. 

Register rS is rotated left n bits where n is the shift amount specified in bits 27-31 of rB.

When bit 26 of rB is a zero, a mask of 32-n ones followed by n zeros is generated. The
rotated word is then merged with the contents of the MQ register, under control of the
generated mask.

When bit 26 of rB is a one, a mask of 32-n zeros followed by n ones is generated. A word
of zeros is then merged with the contents of the MQ register, under control of the generated
mask.

The merged word is placed into rA. The MQ register is not altered. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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slqx 

Shift Left with MQ 

POWER Architecture Instruction

Integer Unit

slq     rA,rS,rB    (Rc=0)
slq.    rA,rS,rB    (Rc=1)



31 


S 


A 


B 
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Re 



5 6 10 11 15 16 20 21 30 31 

This instruction is not part of the PowerPC architecture.

Register rS is rotated left n bits where n is the shift amount specified in bits 27-31 of rB.
The rotated word is placed in the MQ register.

When bit 26 of rB is a zero, a mask of 32-n ones followed by n zeros is generated.

When bit 26 of rB is a one, a mask of all zeros is generated.

The logical AND of the rotated word and the generated mask is placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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SlWx 






Shift Left Word 






slw     rA,rS,rB    (Rc=0)
slw.    rA,rS,rB    (Rc=1)

[POWER mnemonics: sl, sl.]



SlWx 

Integer Unit 
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B 


24 


Re 
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10 11 



15 16 



20 21 



30 31 



n<-rB[27-31] 
rA<-ROTL(rS, n) 

If rB[26]=0, the contents of rS are shifted left the number of bits specified by rB[27-31].
Bits shifted out of position 0 are lost. Zeros are supplied to the vacated positions on the
right. The 32-bit result is placed into rA. If rB[26]=1, 32 zeros are placed into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
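As a rough illustration of the slw rule above, the following Python sketch treats rB[26] as the value-32 bit of rB (bit 0 being the most significant bit in this manual's numbering). The function name `slw` and the integer register model are assumptions made for the example, and the CR0 update is omitted.

```python
def slw(rs, rb):
    """Return the value placed in rA by slw (hypothetical model)."""
    if rb & 0x20:            # rB[26] = 1: the result is 32 zeros
        return 0
    n = rb & 0x1F            # rB[27-31]: shift amount 0..31
    return (rs << n) & 0xFFFFFFFF

print(hex(slw(0xF0000001, 4)))
```

Shift amounts of 32 through 63 therefore all produce zero, rather than wrapping around to a small shift.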
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SraCjX POWER Architecture Instruction 

Shift Right Algebraic with MQ 



sraq    rA,rS,rB    (Rc=0)
sraq.   rA,rS,rB    (Rc=1)



sraqx 

Integer Unit 



31 
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B 
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5 6 



10 11 



15 16 



20 21 



30 31 



This instruction is not part of the PowerPC architecture. 



Register rS is rotated left 32-n bits where n is the shift amount specified in bits 27-31 of
rB. When bit 26 of rB is a zero, a mask of n zeros followed by 32-n ones is generated.
When bit 26 of rB is a one, a mask of all zeros is generated. The rotated word is placed in
the MQ register. The rotated word is then merged with a word of 32 sign bits from rS, under
control of the generated mask.

The merged word is placed in rA.

The rotated word is ANDed with the complement of the generated mask. This 32-bit result
is ORed together and then ANDed with bit 0 of rS to produce XER[CA].

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: CA

All shift right algebraic instructions can be used for a fast divide by 2^n if followed with
addze.

Note: This instruction is specific to the MPC601.
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SraiC|x power Architecture Instruction 

Shift Right Algebraic Immediate with MQ 



sraiq   rA,rS,SH    (Rc=0)
sraiq.  rA,rS,SH    (Rc=1)



sraiqx 

Integer Unit 



Fields:  31 | S | A | SH | 952 | Rc
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

This instruction is not part of the PowerPC architecture.



Register rS is rotated left 32-n bits where n is the shift amount specified by SH. A mask
of n zeros followed by 32-n ones is generated. The rotated word is placed in the MQ
register. The rotated word is then merged with a word of 32 sign bits from rS, under control
of the generated mask.

The merged word is placed in rA.

The rotated word is ANDed with the complement of the generated mask. This 32-bit result
is ORed together and then ANDed with bit 0 of rS to produce XER[CA].

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:

Affected: CA

All shift right algebraic instructions can be used for a fast divide by 2^n if followed with
addze.

Note: This instruction is specific to the MPC601.
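The XER[CA] rule and the divide-by-2^n remark can be illustrated with a short Python sketch. It is a model under assumptions, not MPC601 code: `sraiq`, `rotl32`, `to_signed` and `div_pow2` are hypothetical names, the merge is assumed to take the rotated word where the mask is 1 and sign bits where it is 0, and addze is modeled simply as adding CA to the shifted result.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (n is reduced modulo 32)."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def sraiq(rs, sh):
    """Return (rA, MQ, CA) for sraiq (hypothetical model, CR0 omitted)."""
    n = sh & 0x1F
    rotated = rotl32(rs, 32 - n)
    mask = MASK32 >> n                               # n zeros, then 32-n ones
    sign_word = MASK32 if rs & 0x80000000 else 0     # 32 copies of rS bit 0
    ra = (rotated & mask) | (sign_word & ~mask & MASK32)
    shifted_out = rotated & ~mask & MASK32           # bits lost by the shift
    ca = 1 if (rs & 0x80000000) and shifted_out else 0
    return ra, rotated, ca

def div_pow2(rs, sh):
    """Signed divide by 2**sh, rounding toward zero (shift, then add CA)."""
    ra, _mq, ca = sraiq(rs, sh)
    return to_signed((ra + ca) & MASK32)

print(div_pow2(0xFFFFFFF5, 2))   # -11 / 4 under this model
```

The shift alone rounds toward minus infinity; CA is set only when a negative operand loses 1 bits, so adding it back yields the truncated quotient.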
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srawx 

Shift Right Algebraic Word 



sraw    rA,rS,rB    (Rc=0)
sraw.   rA,rS,rB    (Rc=1)

[POWER mnemonics: sra, sra.]





srawx 

integer Unit 
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30 31 



n <- rB[27-31]
rA <- ROTL(rS, 32-n)

If rB[26]=0, then the contents of rS are shifted right the number of bits specified by rB[27-31].
Bits shifted out of position 31 are lost. The result is padded on the left with sign bits
before being placed into rA. If rB[26]=1, then rA is filled with 32 sign bits (bit 0) from rS.
CR0 is set based on the value written into rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:
Affected: CA
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srawi;^ 

Shift Right Algebraic Word Immediate 

srawi   rA,rS,SH    (Rc=0)
srawi.  rA,rS,SH    (Rc=1)

[POWER mnemonics: srai, srai.]

Integer Unit



31 


S 


A 


SH 
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Re 
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10 11 



15 16 



20 21 



30 31 



n <- SH
rA <- ROTL(rS, 32-n)

The contents of rS are shifted right SH bits. Bits shifted out of position 31 are lost. The
shifted value is sign extended, and the 32-bit result is placed into rA.
XER[CA] is set to 1 if rS contains a negative number and any 1 bits are shifted out of
position 31; otherwise XER[CA] is cleared to 0. A shift amount of zero causes XER[CA]
to be cleared to 0.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:
Affected: CA
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srex 

Shift Right Extended 



POWER Architecture Instruction 



sre     rA,rS,rB    (Rc=0)
sre.    rA,rS,rB    (Rc=1)



srex 

Integer Unit 



31 
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B 
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15 16 



20 21 



30 31 



This instruction is not part of the PowerPC architecture.

Register rS is rotated left 32-n bits where n is the shift amount specified in bits 27-31 of
rB. The rotated word is placed in the MQ register. A mask of n zeros followed by 32-n ones
is generated. The logical AND of the rotated word and the generated mask is placed in rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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Sreax power Architecture Instruction 

Shift Right Extended Algebraic 



srea    rA,rS,rB    (Rc=0)
srea.   rA,rS,rB    (Rc=1)



sreax 

Integer Unit

Fields:  31 | S | A | B | 921 | Rc
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

This instruction is not part of the PowerPC architecture. 

Register rS is rotated left 32-n bits where n is the shift amount specified in bits 27-31 of
rB. A mask of n zeros followed by 32-n ones is generated. The rotated word is placed in
the MQ register. The rotated word is then merged with a word of 32 sign bits from rS, under
control of the generated mask.

The merged word is placed in rA.

The rotated word is ANDed with the complement of the generated mask. This 32-bit result
is ORed together and then ANDed with bit 0 of rS to produce XER[CA].

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• XER:
Affected: CA

Note: This instruction is specific to the MPC601.
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SreC|x POWER Architecture Instruction 

Shift Right Extended with MQ 



sreq    rA,rS,rB    (Rc=0)
sreq.   rA,rS,rB    (Rc=1)



sreqx 

integer Unit 



31 


S 


A 


B 
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15 16 



20 21 



30 31 



This instruction is not part of the PowerPC architecture. 

Register rS is rotated left 32-n bits where n is the shift amount specified in bits 27-31 of 
rB. A mask of n zeros followed by 32-n ones is generated. The rotated word is then merged 
with the contents of the MQ register, under control of the generated mask. The merged 
word is placed in rA. The rotated word is placed into the MQ register. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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SriC|Xx POWER Architecture Instruction 

Shift Right Immediate with MQ 



sriq    rA,rS,SH    (Rc=0)
sriq.   rA,rS,SH    (Rc=1)



sriqx 

Integer Unit 



31 


S 


A 


SH 
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Re 
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20 21 



30 31 



This instruction is not part of the PowerPC architecture. 

Register rS is rotated left 32-n bits where n is the shift amount specified by SH. The rotated
word is placed into the MQ register. A mask of n zeros followed by 32-n ones is generated.
The logical AND of the rotated word and the generated mask is placed in rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.



10-170 



PowerPC 601 RISC Microprocessor User's Manual 



MOTOROLA 



srliqx POWER Architecture Instruction

Shift Right Long Immediate with MQ

srliq   rA,rS,SH    (Rc=0)
srliq.  rA,rS,SH    (Rc=1)

Integer Unit



31 


S 


A 


SH 


760 


Re 



5 6 
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20 21 



30 31 



This instruction is not part of the PowerPC architecture. 



Register rS is rotated left 32-n bits where n is the shift amount specified by SH. A mask of
n zeros followed by 32-n ones is generated. The rotated word is then merged with the MQ
register, under control of the generated mask. The merged word is placed in rA. The rotated
word is placed into the MQ register.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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SrlCjx POWER Architecture Instruction 

Shift Right Long with MQ 



srlq    rA,rS,rB    (Rc=0)
srlq.   rA,rS,rB    (Rc=1)

Integer Unit

Fields:  31 | S | A | B | 728 | Rc
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31



This instruction is not part of the PowerPC architecture. 

Register rS is rotated left 32-n bits where n is the shift amount specified in bits 27-31 of rB.
When bit 26 of rB is a zero, a mask of n zeros followed by 32-n ones is generated. The
rotated word is then merged with the MQ register, under control of the generated mask.

When bit 26 of rB is a one, a mask of n ones followed by 32-n zeros is generated. A word
of zeros is then merged with the contents of the MQ register, under control of the generated
mask.

The merged word is placed in rA. The MQ register is not altered.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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SrC|x POWER Architecture Instruction 

Shift Right with MQ 



Integer Unit

srq     rA,rS,rB    (Rc=0)
srq.    rA,rS,rB    (Rc=1)

Fields:  31 | S | A | B | 664 | Rc
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

This instruction is not part of the PowerPC architecture.



Register rS is rotated left 32-n bits where n is the shift amount specified in bits 27-31 of rB.
The rotated word is placed into the MQ register.

When bit 26 of rB is a zero, a mask of n zeros followed by 32-n ones is generated.

When bit 26 of rB is a one, a mask of all zeros is generated.

The logical AND of the rotated word and the generated mask is placed in rA.

Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

Note: This instruction is specific to the MPC601.
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srwA^ 

Shift Right Word 

srw     rA,rS,rB    (Rc=0)
srw.    rA,rS,rB    (Rc=1)

[POWER mnemonics: sr, sr.]

Integer Unit

Fields:  31 | S | A | B | 536 | Rc
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31



n <- rB[27-31]
rA <- ROTL(rS, 32-n)

If rB[26]=0, the contents of rS are shifted right the number of bits specified by rB[27-31].
Bits shifted out of position 31 are lost. Zeros are supplied to the vacated positions on the
left. The 32-bit result is placed into rA.

If rB[26]=1, then rA is filled with zeros.

Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)
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stb 

Store Byte 



stb     rS,d(rA)

Integer Unit

Fields:  38 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 1) <- rS[24-31]

EA is the sum (rA|0)+d. Register rS[24-31] is stored into the byte in memory addressed
by EA. Register rS is unchanged.

Other registers altered: 
• None 
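The effective-address rule (rA|0)+d used by stb can be sketched in Python as follows. The register file is modeled as a list of 32 integers and memory as a dictionary keyed by address; these choices, and the names `exts16`, `effective_address` and `stb`, are assumptions made for illustration only.

```python
def exts16(d):
    """Sign-extend a 16-bit displacement field, as EXTS(d) does."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def effective_address(gpr, ra, d):
    """(rA|0) + EXTS(d): register number 0 means the constant 0, not GPR0."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + exts16(d)) & 0xFFFFFFFF

def stb(mem, gpr, rs, ra, d):
    """Store rS[24-31] (the low-order byte) at the effective address."""
    ea = effective_address(gpr, ra, d)
    mem[ea] = gpr[rs] & 0xFF
    return ea

gpr = [0] * 32
gpr[3] = 0x1000
gpr[4] = 0x12345678
mem = {}
ea = stb(mem, gpr, 4, 3, 0xFFFC)     # displacement field encodes -4
print(hex(ea), hex(mem[ea]))
```

Note that specifying rA=0 ignores the contents of GPR0 entirely, which is why the pseudocode tests the register number rather than the register value.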
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stbu 

Store Byte with Update 



stbu    rS,d(rA)

Integer Unit

Fields:  39 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 1) <- rS[24-31]
rA <- EA

EA is the sum (rA|0)+d. Register rS[24-31] is stored into the byte in memory addressed
by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 
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stbux 

Store Byte with Update Indexed 



stbux   rS,rA,rB

Integer Unit

Fields:  31 | S | A | B | 247 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 1) <- rS[24-31]
rA <- EA

EA is the sum (rA|0)+(rB). Register rS[24-31] is stored into the byte in memory addressed
by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 






stbx 

Store Byte Indexed 



stbx    rS,rA,rB

Integer Unit

Fields:  31 | S | A | B | 215 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 1) <- rS[24-31]

EA is the sum (rA|0)+(rB). Register rS[24-31] is stored into the byte in memory addressed
by EA. Register rS is unchanged.

Other registers altered: 
• None 
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stfd 

Store Floating-Point Double-Precision 



stfd    frS,d(rA)

Floating-Point Unit

Fields:  54 | frS | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 8) <- (frS)

EA is the sum (rA|0) + d.

The contents of register frS are stored into the double word in memory addressed by EA.

Other registers altered: 
• None 






stfdu 

Store Floating-Point Double-Precision with Update 



stfdu   frS,d(rA)

Floating-Point Unit

Fields:  55 | frS | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 8) <- (frS)
rA <- EA

EA is the sum (rA|0) + d.

The contents of register frS are stored into the double word in memory addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 
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stfdux 

Store Floating-Point Double-Precision with Update Indexed 
stfdux  frS,rA,rB

Floating-Point Unit

Fields:  31 | frS | A | B | 759 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 8) <- (frS)
rA <- EA

EA is the sum (rA|0) + (rB).

The contents of register frS are stored into the double word in memory addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 






stfdx 



Store Floating-Point Double-Precision Indexed 
stfdx   frS,rA,rB

Floating-Point Unit

Fields:  31 | frS | A | B | 727 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 8) <- (frS)

EA is the sum (rA|0) + (rB).

The contents of register frS are stored into the double word in memory addressed by EA.

Other registers altered: 
• None 
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stfs 

Store Floating-Point Single-Precision 



stfs    frS,d(rA)

Integer Unit and Floating-Point Unit

Fields:  52 | frS | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 4) <- SINGLE(frS)

EA is the sum (rA|0)+d.

The contents of register frS are converted to single-precision and stored into the word in
memory addressed by EA.

Other registers altered: 
• None 






stfsu

Store Floating-Point Single-Precision with Update

stfsu   frS,d(rA)

Integer Unit and Floating-Point Unit

Fields:  53 | frS | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 4) <- SINGLE(frS)
rA <- EA

EA is the sum (rA|0) + d.

The contents of frS are converted to single-precision and stored into the word in memory
addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 
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stfsux 

Store Floating-Point Single-Precision with Update Indexed

stfsux  frS,rA,rB

Integer Unit and Floating-Point Unit

Fields:  31 | frS | A | B | 695 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 4) <- SINGLE(frS)
rA <- EA

EA is the sum (rA|0) + (rB).

The contents of frS are converted to single-precision and stored into the word in memory
addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 






stfsx 

Store Floating-Point Single-Precision Indexed

stfsx   frS,rA,rB

Integer Unit and Floating-Point Unit

Fields:  31 | frS | A | B | 663 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 4) <- SINGLE(frS)

EA is the sum (rA|0) + (rB).

The contents of register frS are converted to single-precision and stored into the word in
memory addressed by EA.

Other registers altered: 
• None 
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sth 

Store Half Word 



sth     rS,d(rA)

Integer Unit

Fields:  44 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 2) <- rS[16-31]

EA is the sum (rA|0) + d. Register rS[16-31] is stored into the half word in memory
addressed by EA.

Other registers altered: 
• None 






sthbrx 

Store Half Word Byte-Reverse Indexed

sthbrx  rS,rA,rB

Integer Unit

Fields:  31 | S | A | B | 918 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 2) <- rS[24-31] || rS[16-23]

EA is the sum (rA|0)+(rB). The contents of rS[24-31] are stored into bits 0-7 of the half
word in memory addressed by EA. Bits rS[16-23] are stored into bits 8-15 of the half word
in memory addressed by EA.

Other registers altered: 
• None 
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sthu 

Store Half Word with Update

sthu    rS,d(rA)

Integer Unit

Fields:  45 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 2) <- rS[16-31]
rA <- EA

EA is the sum (rA|0)+d. The contents of rS[16-31] are stored into the half word in memory
addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 






sthux 



Store Half Word with Update Indexed 
sthux   rS,rA,rB

Integer Unit

Fields:  31 | S | A | B | 439 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 2) <- rS[16-31]
rA <- EA

EA is the sum (rA|0)+(rB). Register rS[16-31] is stored into the half word in memory
addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 
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sthx 

Store Half Word Indexed 



sthx    rS,rA,rB

Integer Unit

Fields:  31 | S | A | B | 407 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 2) <- rS[16-31]

EA is the sum (rA|0) + (rB). Register rS[16-31] is stored into the half word in memory
addressed by EA.

Other registers altered: 
• None 






stmw 

Store Multiple Word

stmw    rS,d(rA)

[POWER mnemonic: stm]

Integer Unit

Fields:  47 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
r <- rS
do while r ≤ 31
    MEM(EA, 4) <- GPR(r)
    r <- r + 1
    EA <- EA + 4

EA is the sum (rA|0) + d.

n = (32 - rS).

n consecutive words starting at EA are stored from the GPRs rS through 31. For example,
if rS=30, 2 words are stored.

EA must be a multiple of 4; otherwise, the system alignment error handler may be invoked.

Other registers altered: 
• None 
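The stmw loop can be sketched in Python as below. The model is deliberately simple and rests on assumptions: memory is a `bytearray`, registers are a list of 32 integers, words are written most-significant byte first, and the alignment check is reduced to an assertion. The names `exts16` and `stmw` are hypothetical.

```python
def exts16(d):
    """Sign-extend a 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d

def stmw(mem, gpr, rs, ra, d):
    """Store GPRs rs..31 as consecutive words; return the word count."""
    ea = ((0 if ra == 0 else gpr[ra]) + exts16(d)) & 0xFFFFFFFF
    assert ea % 4 == 0, "EA must be a multiple of 4"
    r = rs
    count = 0
    while r <= 31:
        mem[ea:ea + 4] = (gpr[r] & 0xFFFFFFFF).to_bytes(4, "big")
        r += 1
        ea += 4
        count += 1
    return count

gpr = [0] * 32
gpr[1] = 16
gpr[30] = 0xAABBCCDD
gpr[31] = 0x11223344
mem = bytearray(32)
words = stmw(mem, gpr, 30, 1, 0)
print(words, mem[16:24].hex())
```

The returned count equals 32 - rS, matching the value n given in the description.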
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stswi 

Store String Word Immediate 

stswi   rS,rA,NB

[POWER mnemonic: stsi]

Integer Unit

Fields:  31 | S | A | NB | 725 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then EA <- 0
else EA <- (rA)
if NB = 0 then n <- 32
else n <- NB
r <- rS-1
i <- 0
do while n > 0
    if i = 0 then r <- r+1 (mod 32)
    MEM(EA, 1) <- GPR(r)[i:i+7]
    i <- i+8
    if i = 32 then i <- 0
    EA <- EA+1
    n <- n-1

EA is (rA|0). Let n = NB if NB≠0, n = 32 if NB=0; n is the number of bytes to store. Let
nr = CEIL(n/4): nr is the number of registers to supply data.

n consecutive bytes starting at EA are stored from GPRs rS through rS+nr-1.

Bytes are stored left to right from each register. The sequence of registers wraps around
through GPR0 if required.

Other registers altered: 
• None 
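The byte-by-byte walk through the source registers, including the wraparound from GPR31 to GPR0, can be sketched in Python. This is an illustrative model only: the function name `stswi`, the `bytearray` memory and the list-based register file are assumptions, and the bit index i is assumed to reset after reaching 32 so that each register supplies four bytes.

```python
def stswi(mem, gpr, rs, ra, nb):
    """Store n bytes taken left to right from GPRs rs, rs+1, ... (mod 32)."""
    ea = 0 if ra == 0 else gpr[ra]
    n = 32 if nb == 0 else nb            # NB = 0 encodes 32 bytes
    r = (rs - 1) % 32
    i = 0                                # bit offset of the next byte in GPR(r)
    while n > 0:
        if i == 0:
            r = (r + 1) % 32             # advance to the next register
        # GPR(r)[i:i+7] with bit 0 as the most significant bit
        mem[ea] = (gpr[r] >> (24 - i)) & 0xFF
        i += 8
        if i == 32:
            i = 0
        ea += 1
        n -= 1

gpr = [0] * 32
gpr[5] = 4                               # base address
gpr[31] = 0x41424344
gpr[0] = 0x45464748
mem = bytearray(16)
stswi(mem, gpr, 31, 5, 6)
print(bytes(mem[4:10]))
```

With six bytes requested, nr = CEIL(6/4) = 2 registers are used, and only the two leftmost bytes of the second register are stored.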






stswx 

Store String Word Indexed 

stswx   rS,rA,rB

[POWER mnemonic: stsx]

Integer Unit

Fields:  31 | S | A | B | 661 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b+(rB)
n <- XER[25-31]
r <- rS-1
i <- 0
do while n > 0
    if i = 0 then r <- r+1 (mod 32)
    MEM(EA, 1) <- GPR(r)[i:i+7]
    i <- i+8
    if i = 32 then i <- 0
    EA <- EA+1
    n <- n-1

EA is the sum (rA|0)+(rB). Let n = XER[25-31]; n is the number of bytes to store.

Let nr = CEIL(n/4): nr is the number of registers to supply data.

n consecutive bytes starting at EA are stored from GPRs rS through rS+nr-1.

Bytes are stored left to right from each register. The sequence of registers wraps around
through GPR0 if required.

Other registers altered: 
• None 
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stw 

Store Word 

stw     rS,d(rA)

[POWER mnemonic: st]

Integer Unit

Fields:  36 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 4) <- rS

EA is the sum (rA|0) + d. The contents of rS are stored into the word in memory addressed
by EA.

Other registers altered: 
• None 






stwbrx 



Store Word Byte-Reverse Indexed 

stwbrx  rS,rA,rB

[POWER mnemonic: stbrx]

Integer Unit

Fields:  31 | S | A | B | 662 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 4) <- rS[24-31] || rS[16-23] || rS[8-15] || rS[0-7]

EA is the sum (rA|0)+(rB). The contents of rS[24-31] are stored into bits 0-7 of the word
in memory addressed by EA. Bits rS[16-23] are stored into bits 8-15 of the word in
memory addressed by EA. Bits rS[8-15] are stored into bits 16-23 of the word in memory
addressed by EA. Bits rS[0-7] are stored into bits 24-31 of the word in memory addressed
by EA.

Other registers altered: 
• None 
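The byte ordering produced by stwbrx can be checked with a small Python sketch. It assumes a byte-addressed `bytearray` memory in which bits 0-7 of the stored word occupy the lowest address; the function name `stwbrx` and its simplified signature (EA passed in directly) are hypothetical.

```python
def stwbrx(mem, ea, rs):
    """Store the four bytes of rs in reversed order starting at ea."""
    rs &= 0xFFFFFFFF
    mem[ea] = rs & 0xFF               # rS[24-31] -> bits 0-7 of the word
    mem[ea + 1] = (rs >> 8) & 0xFF    # rS[16-23] -> bits 8-15
    mem[ea + 2] = (rs >> 16) & 0xFF   # rS[8-15]  -> bits 16-23
    mem[ea + 3] = (rs >> 24) & 0xFF   # rS[0-7]   -> bits 24-31

mem = bytearray(8)
stwbrx(mem, 4, 0x12345678)
print(mem[4:8].hex())
```

In other words, the instruction writes the register in little-endian byte order, which is why it is commonly used for exchanging data with little-endian devices or file formats.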
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stwcx. 

Store Word Conditional Indexed 



stwcx.  rS,rA,rB

Integer Unit

Fields:  31 | S | A | B | 150 | 1
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
if RESERVE then
    MEM(EA, 4) <- rS
    RESERVE <- 0
    CR0 <- 0b00 || 0b1 || XER[SO]
else
    CR0 <- 0b00 || 0b0 || XER[SO]

EA is the sum (rA|0)+(rB).

If a reservation exists, the contents of rS are stored into the word in memory addressed by
EA and the reservation is cleared. If no reservation exists, the instruction completes without
altering memory.

CR0 Field is set to reflect whether the store operation was performed (i.e., whether a
reservation existed when the stwcx. instruction commenced execution) as follows.

CR0[LT GT EQ SO] <- b'00' || store_performed || XER[SO]

The EQ bit in the condition register field CR0 is modified to reflect whether the store
operation was performed (i.e., whether a reservation existed when the stwcx. instruction
began execution). If the store was completed successfully, the EQ bit is set to one.

EA must be a multiple of 4; otherwise, the system alignment error handler may be invoked
or the results may be undefined.

Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO
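The way stwcx. reports success through CR0 can be modeled with a few lines of Python. The `Cpu` class, its attribute names, and the dictionary-based memory are all hypothetical; the model covers only the reservation flag and the CR0 encoding described above, not how a reservation is established or lost on real hardware.

```python
class Cpu:
    """Minimal hypothetical state: reservation flag, XER[SO], CR0, memory."""
    def __init__(self):
        self.reserve = False
        self.xer_so = 0
        self.cr0 = 0          # 4-bit field ordered LT, GT, EQ, SO
        self.mem = {}

def stwcx(cpu, ea, value):
    """Conditionally store a word; return 1 if the store was performed."""
    if cpu.reserve:
        cpu.mem[ea] = value & 0xFFFFFFFF
        cpu.reserve = False
        performed = 1
    else:
        performed = 0
    # CR0 <- 0b00 || store_performed || XER[SO]
    cpu.cr0 = (performed << 1) | cpu.xer_so
    return performed

cpu = Cpu()
cpu.reserve = True            # assume an earlier load established a reservation
first = stwcx(cpu, 0x100, 7)
second = stwcx(cpu, 0x100, 9) # no reservation remains, so memory is untouched
print(first, second, cpu.mem[0x100], bin(cpu.cr0))
```

Software typically branches on the EQ bit after stwcx. and retries the whole update sequence when the store was not performed.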






stwu 

Store Word with Update 

stwu    rS,d(rA)

[POWER mnemonic: stu]

Integer Unit

Fields:  37 | S | A | d
Bits:    0-5 | 6-10 | 11-15 | 16-31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + EXTS(d)
MEM(EA, 4) <- rS
rA <- EA

EA is the sum (rA|0)+d. The contents of rS are stored into the word in memory addressed
by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 
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stwux 

Store Word with Update Indexed 

stwux   rS,rA,rB

[POWER mnemonic: stux]

Integer Unit

Fields:  31 | S | A | B | 183 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 4) <- rS
rA <- EA

EA is the sum (rA|0)+(rB). The contents of rS are stored into the word in memory
addressed by EA.

EA is placed into rA.

While the PowerPC architecture defines the instruction form as invalid if rA=0, the
MPC601 supports execution with rA=0 as shown above.

Other registers altered: 
• None 






stwx 

Store Word Indexed 

stwx    rS,rA,rB

[POWER mnemonic: stx]

Integer Unit

Fields:  31 | S | A | B | 151 | Reserved
Bits:    0-5 | 6-10 | 11-15 | 16-20 | 21-30 | 31

if rA = 0 then b <- 0
else b <- (rA)
EA <- b + (rB)
MEM(EA, 4) <- rS

EA is the sum (rA|0)+(rB). The contents of rS are stored into the word in memory
addressed by EA.

Other registers altered: 
• None 
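
The field layout of stwx (primary opcode 31, then S, A, B, extended opcode 151, and a reserved bit) can be turned into a 32-bit instruction word by shifting each field into place. The sketch below assumes the PowerPC convention that bit 0 is the most significant bit; the helper name `encode_x_form` is hypothetical and is not part of any toolchain.

```python
def encode_x_form(primary, s, a, b, extended, bit31=0):
    """Pack fields at bits 0-5, 6-10, 11-15, 16-20, 21-30, and 31 (bit 0 = MSB)."""
    fields = ((primary, 6), (s, 5), (a, 5), (b, 5), (extended, 10), (bit31, 1))
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in its width")
    return ((primary << 26) | (s << 21) | (a << 16) |
            (b << 11) | (extended << 1) | bit31)

stwx_word = encode_x_form(31, 3, 4, 5, 151)    # stwx r3,r4,r5
stwux_word = encode_x_form(31, 3, 4, 5, 183)   # stwux r3,r4,r5
print(hex(stwx_word), hex(stwux_word))
```

The same helper covers stwux because the two instructions differ only in the extended opcode (183 instead of 151).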
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subfx 

Subtract from 

Integer Unit 

subf     rD,rA,rB    (OE=0 Rc=0) 
subf.    rD,rA,rB    (OE=0 Rc=1) 
subfo    rD,rA,rB    (OE=1 Rc=0) 
subfo.   rD,rA,rB    (OE=1 Rc=1) 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-20   21    22-30   31 
Field    31     D      A       B       OE    40      Rc 



rD ← ¬(rA) + (rB) + 1 

The sum ¬(rA) + (rB) + 1 is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 

Affected: SO, OV (if OE=1) 
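
The sum ¬(rA) + (rB) + 1 is the two's-complement form of (rB) minus (rA). The short sketch below checks that identity on 32-bit values; the function name `subf` and the constant `MASK32` are illustrative, and the condition register and XER updates are not modelled.

```python
MASK32 = 0xFFFFFFFF

def subf(ra, rb):
    """rD <- not(rA) + (rB) + 1, truncated to 32 bits."""
    return ((~ra & MASK32) + (rb & MASK32) + 1) & MASK32

result = subf(5, 12)     # 12 - 5
wrapped = subf(12, 5)    # 5 - 12, wraps to the 32-bit encoding of -7
```

Note the operand order: the value in rA is subtracted from the value in rB, which is why the instruction is named "subtract from".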






subfc 



Subtract from Carrying 

Integer Unit 

subfc     rD,rA,rB    (OE=0 Rc=0) 
subfc.    rD,rA,rB    (OE=0 Rc=1) 
subfco    rD,rA,rB    (OE=1 Rc=0) 
subfco.   rD,rA,rB    (OE=1 Rc=1) 

[POWER mnemonics: sf, sf., sfo, sfo.] 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-20   21    22-30   31 
Field    31     D      A       B       OE    8       Rc 



rD ← ¬(rA) + (rB) + 1 

The sum ¬(rA) + (rB) + 1 is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV (if OE=1) 
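
The description lists CA as affected but does not spell out its value on this page. The sketch below assumes that CA receives the carry out of the most significant bit of the 32-bit sum, so that CA=1 means no borrow occurred; the function name `subfc` and the returned tuple are illustrative only.

```python
MASK32 = 0xFFFFFFFF

def subfc(ra, rb):
    """Return (rD, CA) for rD <- not(rA) + (rB) + 1, assuming CA is the carry out."""
    total = (~ra & MASK32) + (rb & MASK32) + 1
    return total & MASK32, total >> 32

no_borrow = subfc(5, 12)   # 12 - 5: result 7, carry out set
borrow = subfc(12, 5)      # 5 - 12: result wraps, carry out clear
```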
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subfex 




Subtract from Extended 

Integer Unit 

subfe     rD,rA,rB    (OE=0 Rc=0) 
subfe.    rD,rA,rB    (OE=0 Rc=1) 
subfeo    rD,rA,rB    (OE=1 Rc=0) 
subfeo.   rD,rA,rB    (OE=1 Rc=1) 

[POWER mnemonics: sfe, sfe., sfeo, sfeo.] 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-20   21    22-30   31 
Field    31     D      A       B       OE    136     Rc 



rD ← ¬(rA) + (rB) + XER[CA] 

The sum ¬(rA) + (rB) + XER[CA] is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV (if OE=1) 
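
Because subfe adds XER[CA] in place of the constant 1, it can continue a subtraction started on a lower-order word. The sketch below illustrates one such use, a 64-bit subtraction built from a subfc-style step followed by a subfe-style step. It assumes CA is the carry out of each 32-bit sum; the names `add3` and `sub64` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def add3(x, y, carry_in):
    """Add two 32-bit values plus a carry; return (32-bit sum, carry out)."""
    total = (x & MASK32) + (y & MASK32) + carry_in
    return total & MASK32, total >> 32

def sub64(a_hi, a_lo, b_hi, b_lo):
    """Compute b - a for 64-bit values held as (high word, low word) pairs."""
    lo, ca = add3(~a_lo & MASK32, b_lo, 1)    # low words: not(rA) + (rB) + 1
    hi, ca = add3(~a_hi & MASK32, b_hi, ca)   # high words: not(rA) + (rB) + CA
    return hi, lo

# 0x0000000100000000 - 0x0000000000000001
difference = sub64(0x00000000, 0x00000001, 0x00000001, 0x00000000)
```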






subfic 



Subtract from Immediate Carrying 

subfic rD,rA,SIMM 

[POWER mnemonic: sfi] 



Integer Unit 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-31 
Field    08     D      A       SIMM 



rD ← ¬(rA) + EXTS(SIMM) + 1 

The sum ¬(rA) + EXTS(SIMM) + 1 is placed into rD. 

Other registers altered: 
• XER: 

Affected: CA 
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subfmex 

Subtract from Minus One Extended 

Integer Unit 

subfme     rD,rA    (OE=0 Rc=0) 
subfme.    rD,rA    (OE=0 Rc=1) 
subfmeo    rD,rA    (OE=1 Rc=0) 
subfmeo.   rD,rA    (OE=1 Rc=1) 

[POWER mnemonics: sfme, sfme., sfmeo, sfmeo.] 

Instruction encoding (bits 16-20 are reserved): 

Bits     0-5    6-10   11-15   16-20   21    22-30   31 
Field    31     D      A       00000   OE    232     Rc 



rD ← ¬(rA) + XER[CA] - 1 

The sum ¬(rA) + XER[CA] + x'FFFFFFFF' is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV (if OE=1) 






subfzex 

Subtract from Zero Extended 

Integer Unit 

subfze     rD,rA    (OE=0 Rc=0) 
subfze.    rD,rA    (OE=0 Rc=1) 
subfzeo    rD,rA    (OE=1 Rc=0) 
subfzeo.   rD,rA    (OE=1 Rc=1) 

[POWER mnemonics: sfze, sfze., sfzeo, sfzeo.] 

Instruction encoding (bits 16-20 are reserved): 

Bits     0-5    6-10   11-15   16-20   21    22-30   31 
Field    31     D      A       00000   OE    200     Rc 

rD ← ¬(rA) + XER[CA] 

The sum ¬(rA) + XER[CA] is placed into rD. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 

• XER: 
Affected: CA 

Affected: SO, OV (if OE=1) 
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sync 

Synchronize 

[POWER mnemonic: dcs] 



Integer Unit 

Instruction encoding (bits 6-20 and bit 31 are reserved): 

Bits     0-5    6-10    11-15   16-20   21-30   31 
Field    31     00000   00000   00000   598     reserved 



The sync instruction provides an ordering function for the effects of all instructions 
executed by a given processor. Executing a sync instruction ensures that all instructions 
previously initiated by the given processor appear to have completed before any subsequent 
instructions are initiated by the given processor. When the sync instruction completes, all 
external accesses initiated by the given processor prior to the sync will have been 
performed with respect to all other mechanisms that access memory. 

The sync instruction can be used to ensure that the results of all stores into a data structure, 
performed in a "critical section" of a program, are seen by other processors before the data 
structure is seen as unlocked. The eieio instruction may be more appropriate than sync for 
cases in which the only requirement is to control the order in which external references are 
seen by I/O devices. 

Other registers altered: 

• None 
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tibie tibie 

Translation Lookaside Buffer Invalidate Entry 

Integer Unit 

tlbie rB 

[POWER mnemonic: tlbi] 

Instruction encoding (bits 6-15 and bit 31 are reserved): 

Bits     0-5    6-10    11-15   16-20   21-30   31 
Field    31     00000   00000   B       306     reserved 

EA ← (rB) 

if UTLB entry exists for EA, then 
UTLB entry ← invalid 

EA is the contents of rB. Any entries in the translation lookaside buffer (referred to as 
the UTLB) that correspond to the EA are made invalid (i.e., removed from the 
UTLB). Additionally, a TLB invalidate operation is broadcast on the system interface. 

The UTLB search is done regardless of the settings of MSR[IT] and MSR[DT]. 

Block address translation for EA, if any, is ignored. 

If the segment register for EA specifies SR[T]=1 (an I/O controller interface segment), no 
UTLB entry invalidation is performed on the local processor and no TLB invalidate 
operation is broadcast on the system interface. 

Because the MPC601 supports broadcast of TLB entry invalidate operations, the 
following must be observed: 

• The tlbie instruction(s) must be contained in a critical section, controlled by 
software locking, so that tlbie is issued on only one processor at a time. 

• A sync instruction must be issued after every tlbie and at the end of the critical 
section. This causes the hardware to wait for the effects of the preceding tlbie 
instruction(s) to propagate to all processors. 

A processor detecting a TLB invalidate broadcast performs the following: 

1. Prevents execution of any new load, store, cache control, or tlbie instructions and 
prevents any new reference or change bit updates 

2. Waits for completion of any outstanding memory operations (including updates to 
the reference and change bits associated with the entry to be invalidated) 

3. Invalidates the two entries (both associativity classes) in the UTLB indexed by the 
matching address 

4. Resumes normal execution 
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This is a supervisor-level instruction. 

The software must ensure that SDR1 points to the page table when issuing tlbie, even when 
address translation is disabled. Nothing is guaranteed about instruction fetching in other 
processors if the tlbie instruction deletes the page in which some other processor is 
currently executing. 

This instruction is optional in the PowerPC architecture. 

Other registers altered: 
• None 
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tw 



Trap Word 

Integer Unit 

tw TO,rA,rB 

[POWER mnemonic: t] 

Instruction encoding (bit 31 is reserved): 

Bits     0-5    6-10   11-15   16-20   21-30   31 
Field    31     TO     A       B       4       reserved 



a ← EXTS(rA) 
b ← EXTS(rB) 

if (a < b) & TO[0] then TRAP 
if (a > b) & TO[1] then TRAP 
if (a = b) & TO[2] then TRAP 
if (a <U b) & TO[3] then TRAP 
if (a >U b) & TO[4] then TRAP 

The contents of rA are compared with the contents of rB. If any bit in the TO field is set to 
1 and its corresponding condition is met by the result of the comparison, then the system 
trap handler is invoked. 

Other registers altered: 
• None 
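
The five TO bits select which comparison outcomes cause a trap: signed less than, signed greater than, equal, unsigned less than, and unsigned greater than. The following sketch evaluates that rule in Python. It assumes TO[0] is the most significant bit of the 5-bit field; the names `tw_traps` and `to_signed` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def tw_traps(to, ra, rb):
    """Return True if tw TO,rA,rB would invoke the trap handler."""
    a_s, b_s = to_signed(ra), to_signed(rb)
    a_u, b_u = ra & MASK32, rb & MASK32
    conditions = (a_s < b_s, a_s > b_s, a_s == b_s, a_u < b_u, a_u > b_u)
    # TO[i] is taken as bit i counted from the most significant end of the field
    return any(cond and (to >> (4 - i)) & 1 for i, cond in enumerate(conditions))

trap_if_equal = tw_traps(0b00100, 7, 7)
trap_signed_lt = tw_traps(0b10000, 0xFFFFFFFF, 1)     # -1 < 1 when signed
trap_unsigned_lt = tw_traps(0b00010, 0xFFFFFFFF, 1)   # not less when unsigned
```

With all five TO bits set, some condition always holds, so the sketch reports a trap for every pair of operands.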
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twi twi 

Trap Word Immediate 

Integer Unit 

twi TO,rA,SIMM 

[POWER mnemonic: ti] 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-31 
Field    03     TO     A       SIMM 

a ← EXTS(rA) 

if (a < EXTS(SIMM)) & TO[0] then TRAP 
if (a > EXTS(SIMM)) & TO[1] then TRAP 
if (a = EXTS(SIMM)) & TO[2] then TRAP 
if (a <U EXTS(SIMM)) & TO[3] then TRAP 
if (a >U EXTS(SIMM)) & TO[4] then TRAP 

The contents of rA are compared with the sign-extended SIMM field. If any bit in the TO 
field is set to 1 and its corresponding condition is met by the result of the comparison, then 
the system trap handler is invoked. 

Other registers altered: 
• None 
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xorx 

XOR 

Integer Unit 

xor     rA,rS,rB    (Rc=0) 
xor.    rA,rS,rB    (Rc=1) 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-20   21-30   31 
Field    31     S      A       B       316     Rc 



rA ← (rS) ⊕ (rB) 

The contents of rS are XORed with the contents of rB and the result is placed into rA. 

Other registers altered: 

• Condition Register (CR0 Field): 

Affected: LT, GT, EQ, SO (if Rc=1) 
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xori 

XOR Immediate 

Integer Unit 

xori rA,rS,UIMM 

[POWER mnemonic: xoril] 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-31 
Field    26     S      A       UIMM 



rA ← (rS) ⊕ ((16)0 || UIMM) 

The contents of rS are XORed with x'0000' || UIMM and the result is placed into rA. 

Other registers altered: 
• None 
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xoris 

XOR Immediate Shifted 

xoris rA,rS,UIMM 

[POWER mnemonic: xoriu] 



Integer Unit 

Instruction encoding: 

Bits     0-5    6-10   11-15   16-31 
Field    27     S      A       UIMM 



rA ← (rS) ⊕ (UIMM || (16)0) 

The contents of rS are XORed with UIMM || x'0000' and the result is placed into rA. 

Other registers altered: 
• None 
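
Since xori applies its 16-bit immediate to the low halfword and xoris applies it to the high halfword, the two can be paired to XOR a register with an arbitrary 32-bit constant. The sketch below illustrates that pairing; it is an example of use rather than something stated in the instruction descriptions, and the names `xori`, `xoris`, and `xor_const32` are Python stand-ins.

```python
MASK32 = 0xFFFFFFFF

def xori(rs, uimm):
    """rA <- (rS) XOR (x'0000' || UIMM)."""
    return (rs ^ (uimm & 0xFFFF)) & MASK32

def xoris(rs, uimm):
    """rA <- (rS) XOR (UIMM || x'0000')."""
    return (rs ^ ((uimm & 0xFFFF) << 16)) & MASK32

def xor_const32(rs, const):
    """XOR a register value with a full 32-bit constant in two steps."""
    return xori(xoris(rs, const >> 16), const & 0xFFFF)

flipped_high = xor_const32(0x12345678, 0xFFFF0000)
loaded = xor_const32(0x00000000, 0xDEADBEEF)
```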
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10.3 Instructions Not Implemented by the MPC601 

Table 10-6 provides a list of 32-bit instructions that are not implemented by the MPC601 
and that generate an illegal instruction exception. Refer to Appendix C, "PowerPC 
Instructions Not Implemented in MPC601," for a more detailed description of the 
instructions. 

Table 10-6. 32-Bit Instructions Not Implemented by the MPC601 

Mnemonic    Instruction 
fres        Floating-Point Reciprocal Estimate Single-Precision 
frsqrte     Floating-Point Reciprocal Square Root Estimate 
fsel        Floating-Point Select 
fsqrt       Floating-Point Square Root 
fsqrts      Floating-Point Square Root Single-Precision 
mftb        Move from Time Base 
stfiwx      Store Floating-Point as Integer Word Indexed 
tlbia       Translation Lookaside Buffer Invalidate All 
tlbiex      Translation Lookaside Buffer Invalidate Entry by Index 
tlbsync     Translation Lookaside Buffer Synchronize 


Table 10-7 provides a list of 32-bit SPR encodings that are not implemented by the 
MPC601. 

Table 10-7. 32-Bit SPR Encodings Not Implemented by the MPC601 

           SPR 
Decimal    SPR[5-9]   SPR[0-4]   Register Name   Access 
284        01000      11100      TB              Supervisor 
285        01000      11101      TBU             Supervisor 
536        10000      11000      DBAT0U          Supervisor 
537        10000      11001      DBAT0L          Supervisor 
538        10000      11010      DBAT1U          Supervisor 
539        10000      11011      DBAT1L          Supervisor 
540        10000      11100      DBAT2U          Supervisor 
541        10000      11101      DBAT2L          Supervisor 
542        10000      11110      DBAT3U          Supervisor 
543        10000      11111      DBAT3L          Supervisor 
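
The decimal column of Table 10-7 is consistent with reading the SPR[5-9] column as the upper five bits and the SPR[0-4] column as the lower five bits of a 10-bit number. The sketch below checks that reading against rows of the table; the function name `spr_number` is hypothetical, and the relationship is inferred from the table values rather than stated in the text.

```python
def spr_number(spr_5_9, spr_0_4):
    """Combine the two 5-bit column values (given as bit strings) into the decimal SPR number."""
    return (int(spr_5_9, 2) << 5) | int(spr_0_4, 2)

tb = spr_number("01000", "11100")        # TB row
dbat0u = spr_number("10000", "11000")    # DBAT0U row
dbat3l = spr_number("10000", "11111")    # DBAT3L row
```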



Table 10-8 provides a list of 64-bit instructions that are not implemented by the MPC601, 
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and that generate an illegal instruction exception. Refer to Appendix C, "PowerPC 
Instructions Not Implemented in MPC601." 

Table 10-8. 64-Bit Instructions Not Implemented by the MPC601 

Mnemonic    Instruction 
cntlzd      Count Leading Zeros Double Word 
divd        Divide Double Word 
divdu       Divide Double Word Unsigned 
extsw       Extend Sign Word 
fcfid       Floating Convert From Integer Double Word 
fctid       Floating Convert to Integer Double Word 
fctidz      Floating Convert to Integer Double Word with Round to Zero 
ld          Load Double Word 
ldarx       Load Double Word and Reserve Indexed 
ldu         Load Double Word with Update 
ldux        Load Double Word with Update Indexed 
ldx         Load Double Word Indexed 
lwa         Load Word Algebraic 
lwaux       Load Word Algebraic with Update Indexed 
lwax        Load Word Algebraic Indexed 
mulld       Multiply Low Double Word 
mulhd       Multiply High Double Word 
mulhdu      Multiply High Double Word Unsigned 
rldcl       Rotate Left Double Word then Clear Left 
rldcr       Rotate Left Double Word then Clear Right 
rldic       Rotate Left Double Word Immediate then Clear 
rldicl      Rotate Left Double Word Immediate then Clear Left 
rldicr      Rotate Left Double Word Immediate then Clear Right 
rldimi      Rotate Left Double Word Immediate then Mask Insert 
slbia       SLB Invalidate All 
slbie       SLB Invalidate Entry 
slbiex      SLB Invalidate Entry by Index 
sld         Shift Left Double Word 
srad        Shift Right Algebraic Double Word 
sradi       Shift Right Algebraic Double Word Immediate 
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Table 10-8. 64-Bit Instructions Not Implemented by the MPC601 (Continued) 



Mnemonic    Instruction 
srd         Shift Right Double Word 
std         Store Double Word 
stdcx.      Store Double Word Conditional Indexed 
stdu        Store Double Word with Update 
stdux       Store Double Word Indexed with Update 
stdx        Store Double Word Indexed 
td          Trap Double Word 
tdi         Trap Double Word Immediate 



Table 10-9 provides the 64-bit SPR encoding that is not implemented by the MPC601. 

Table 10-9. 64-Bit SPR Encoding Not Implemented by the MPC601 

           SPR 
Decimal    SPR[5-9]   SPR[0-4]   Register Name   Access 
280        01000      11000      ASR             Supervisor 
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Appendix A 
Instruction Set Listings 



This appendix lists the instruction set implemented in the MPC601 and the additional 
PowerPC instructions not implemented in the MPC601, first sorted by mnemonic and then 
by opcode. 

A.1 Complete Instruction List Sorted by Mnemonic 

Table A-1 lists the instructions implemented in the MPC601 plus those defined in the 
PowerPC architecture that are not implemented in the MPC601, in alphabetical order by 
mnemonic. 



Table A-1. Complete Instruction List Sorted by Mnemonic 

Key (indicated by shading in the original table): reserved bits; instruction not 
implemented in the MPC601. 

Name      Fields (left to right, bit 0 through bit 31) 
absx      31 | D | A | 00000 | OE | 360 | Rc 
addx      31 | D | A | B | OE | 266 | Rc 
addcx     31 | D | A | B | OE | 10 | Rc 
addex     31 | D | A | B | OE | 138 | Rc 
addi      14 | D | A | SIMM 
addic     12 | D | A | SIMM 
addic.    13 | D | A | SIMM 
addis     15 | D | A | SIMM 
addmex    31 | D | A | 00000 | OE | 234 | Rc 
addzex    31 | D | A | 00000 | OE | 202 | Rc 
andx      31 | S | A | B | 28 | Rc 
andcx     31 | S | A | B | 60 | Rc 
andi.     28 | S | A | UIMM 
andis.    29 | S | A | UIMM 
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A-1 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



bx 


18 


LI 


AALK 


bcx 


16 


BO 


61 


BD 


AALK 


bcctrx 


19 


BO 


Bl 


OOOOO 


528 


LK 


bclrx 


19 


BO 


Bl 


OOOOO 


16 


LK 


cics 


31 


D 


A 


OOOOO 


531 


Re 


cmp 


31 


crfD 


M 


L 


A 


B 


OOOOOOOOOO 





cmpi 


11 


crfD 


11 


L 


A 


SIMM 


cmpi 


31 


crfD 


;[i 


L 


A 


B 


32 


Q 


cmpli 


10 


crfD 


:|i 


L 


A 


UIMM 


«\il2d 


31 


S 


A 


OOOOO 


11 


58 


II 


Re 


cntlzwA^ 


31 


S 


A 


OOOOO 


26 


Re 


crand 


19 


crbD 


crbA 


crbB 


257 





crandc 


19 


crbD 


crbA 


crbB 


129 





creqv 


19 


crbD 


crbA 


crbB 


289 





crnand 


19 


crbD 


crbA 


crbB 


225 





crnor 


19 


crbD 


crbA 


crbB 


33 





cror 


19 


crbD 


crbA 


crbB 


449 





crorc 


19 


crbD 


crbA 


crbB 


417 





crxor 


19 


crbD 


crbA 


crbB 


193 





dcbf 


31 


OOOOO 


A 


B 


86 





dcbi 


31 


00000 


A 


B 


470 





dcbst 


31 


OOOOO 


A 


B 


54 





debt 


31 


OOOOO 


A 


B 


278 





dcbtst 


31 


OOOOO 


A 


B 


246 





dcbz 


31 


OOOOO 


A 


B 


1014 





divx 


31 


D 


A 


B 


OE 


331 


Re 


am 


^i«»iii« 
liiiiiiiii»iiii 


D 


A . 


B 


oe 


iillB^BIllli 


:f5t 


divdu 




D 


A . 


B 


OB 


liiiiiiiiiiiiiHiiiii 


mc 


divsx 


31 


D 


A 


B 


OE 


363 


Re 


divwx 


31 


D 


A 


B 


OE 


491 


Re 


divwux 


31 


D 


A 


B 


OE 


459 


Re 


doz;f 


31 


D 


A 


B 


OE 


264 


Re 


dozi 


9 


D 


A 


SIMM 
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Name 
eciwx 
ecowx 
eieio 
eqvx 
extsbx 
extsh;f 

fabsx 

faddx 

faddsx 

fcfid 

tempo 

fcmpu 

fetid 

fctjcfe 

fctiwx 

fctiwzx 

fdivx 

fdivsx 

fmaddx 

fmaddsx 

fmrx 

fmsubx 

fmsubsx 

fmulx 

fmulsx 

fnabsx 

fnegx 

fnmaddx 

fnmaddsx 

fnmsubx 

fnmsubsx 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



31 


D 


A 


B 


310 





31 


S 


A 


B 


438 





31 


lllllliilllll 


lilii;lllii|;| 


00000 


854 





31 


s 


A 


B 


284 


Re 


31 


s 


A 


00000 


954 


Re 


31 


s 


A 


00000 


922 


Re 


lllllllllll 


liiiiiii 


mm,m=mm. 


00000 


98e 


Re 


63 


D 


OQQOD 


B 


264 


Re 


63 


D 


A 


B 


i;|ii|||i;:Q|ii 


21 


Re 


59 


D 


A 


B 


11111111111 


21 


Re 


6S 


iiiiiiijii 


iiiiiiiiii 


Iilllllll 


llllllilllllllillllllil 


Re 


63 


crfD 


00 


A 


B 


32 





63 


crfD 


00 


A 


B 


|illl|ii|iiplillil|;|;lllll 





€3 


iilllllll 


lllllliiilill 


ililiiiii 


iiiiiiiiiiiiiiiiiiiiiii 


Be 


63 


llllllili: 


;iiiili|pii; 


llllljllllj 


^iiiliiiiiiiiiiiiiiiiiii 


Re 


63 


D 


llllllllllll 


B 


14 




63 


D 


||||i||i|||;; 


B 


15 


Re 


63 


D 


A 


B 


OQQOO 


18 


Re 


59 


D 


A 


B 


OOOOO 


18 


Re 


63 


D 


A 


B 


C 


29 


Re 


59 


D 


A 


B 


C 


29 


Re 


63 


D 


00000 


B 


72 


Re 


63 


D 


A 


B 


C 


28 


Re 


59 


D 


A 


B 


C 


28 


Re 


63 


D 


A 


00000 


C 


25 


Re 


59 


D 


A 


00000 


C 


25 


Re 


63 


D 


00000 


B 


136 


Re 


63 


D 


00000 


B 


40 


Re 


63 


D 


A 


B 


C 


31 


Re 


59 


D 


A 


B 


c 


31 


Re 


63 


D 


A 


B 


c 


30 


Re 


59 


D 


A 


B 


c 


30 


Re 


5^ 


m 


OOOOO 


IrB 


OOOOO 


Iilllllll 


;R<? 



MOTOROLA 



Appendix A. Instruction Set Listings 



A-3 



Name 

frsp;f 
tr*qrt$ 

feqrtisr: 

fsqrts 

fsub;r 

fsubsx 

icbi 

isync 

Ibz 

Ibzu 
Ibzux 

Ibzx 

{d 

tdarx 

IrfU 

fdux 

Wx 

Ifdu 
Ifdux 

Ifdx 
Ifs 

Ifsu 
Ifsux 

Ifsx 
Iha 

Ihau 
Ihaux 

Ihax 

Ihbrx 

Ihz 

Ihzu 
Ihzux 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



63 


D 


00000 


B 


12 


Re 


63 


illlllilB 


OOOOO 


m 


00000 


IBBilli 


B 


m 


liiiipiiii: 

mmiimmmmmi- 


frA 


i(B 


« 


iBBIil 


n 


m 


iliiBiili 


00000 


m 


00000 


piiii^iiiiii 


ii 


S9 


iiiilliiiij^iis 


Of^OOtl 


m 


ooooo 


iiiiiiftliiiii 


i 


63 


D 


A 


B 


00000 


20 


Re 


59 


D 


A 


B 


00000 


20 


Re 


31 


00000 


A 


B 


982 


||; 


19 


00000 


00000 


00000 


150 


11 


34 


D 


A 


d 


35 


D 


A 


d 


31 


D 


A 


B 


119 


is 


31 


D 


A 


B 


87 


ill 


56 


IBiBiHi 




iiiiliiliii 


ds 


liiiii 


11 


Bi 


iilliilll 


■■ill 


IIIBIII 


64 


lllllll 


i 


58 


111111111 


■iiiii 


llllilill 


Cfe 


iiiiiiiii 


il 


3t 


iiiiili 


IBiilli 


■■III 


53 


lllllll 


Ij 


31 


llllilill 


iiHiiiiiiii 


lilHIIII 


2t 


lllllll 


m 


51 


D 


A 


d 


31 


D 


A 


B 


631 


iJ 


31 


D 


A 


B 


599 


il 


48 


D 


A 


d 


49 





A 


d 


31 


D 


A 


B 


567 


il 


31 


D 


A 


B 


535 


s 


42 


D 


A 


d 


43 


D 


A 


d 


31 


D 


A 


B 


375 





31 


D 


A 


B 


343 





31 


D 


A 


B 


790 





40 


D 


A 


d 


41 


D 


A 


d 


31 


D 


A 


B 


311 


III 
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Name 

Ihzx 

Imw 

Iscbxx 

Iswi 

Iswx 

(wa 

Iwarx 

twax 

Iwbrx 

Iwz 

Iwzu 

Iwzux 

Iwzx 

maskgx 

maskirx 

mcrf 

mcrfs 

mcrxr 

mfcr 

mffsx 

mfmsr 

mfspr 

mfsr 

mfsrin 

mftb; 

mtcrf 

mtfsbOx 

mtfsblx 

mtfsfx 

mtfsfix 

mtmsr 

mtspr 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



31 


D 


A 


B 


279 





46 


D 


A 


d 


31 


D 


A 


B 


277 


Rc 


31 


D 


A 


NB 


597 


&■ 


31 


D 


A 


B 


533 


w 


58 


||||||H|||j| 


A 


IlilllllllliiB^^ 


II 


31 


D 


A 


B 


20 


M 


31 


llllllli 


A 


IIIBIII 


iiiiiiiiSiiiiiiii 


ji 


3$ 


iiiiiliii 


. A 


IIIIIIIIIII 


lllllljllllllllllH^^^ 


III 


31 


D 


A 


B 


534 


Wi 


32 


D 


A 


d 


33 


D 


A 


d 


31 


D 


A 


B 


55 


m 


31 


D 


A 


B 


23 


m 


31 


S 


A 


B 


29 


Rc 


31 


S 


A 


B 


541 


Rc 


19 


crfD 


lllli; 


crfS 


;|ii| 


00000 


OOOOOOQDOO 


Wi 


63 


crfD 


iiit 


crfS 


■iiii 


00000 


64 


M 


31 


criS 


iiii 






00000 


512 


M 


OOQOO 


31 


D 


:;;||;iiiil|;;|:;;| 


00000 


19 


§1 


63 


D 


iiiiiiiiiii 


OOODO 


583 


Rc 


31 


D 


|iii|ill;illl 


OQOOO 


83 


|i; 


31 


^ 


SPR 


339 


li 


31 


D 


Wl 


SR 


liiilli;i:ii|i 


595 


|i; 


31 


D 


iiiiiiiiiii; 


B 


659 


lis 


lllllilllll 


III O 


IIIIBlHIIIilll 


IlllllBlllllljillll 


11 


31 


s 


Im 


CRM 


;i|| 


144 


ill 


63 


crbD 


||i;|||||:i|: 


00000 


70 


Rc 


63 


crbD 


;|||ii|il|||i 


00000 


38 


Rc 


31 





FM 


if 


frB 


711 


Rc 


' 63 


crbD 


iiil 


iiiiiiiiiii 


IMM 


III 


134 


Rc 


31 


s 


'||ii||i||ii 


OQOOO 


146 





31 


D 


SPR 


467 
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Name 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



mlsr 


31 


S 


Q 


SR 


ooooo 


210 





mtsrin 


31 


S 


OOOOO 


B 


242 





mulx 


31 


D 


A 


B 


OE 


107 


Re 


muthd 


31 





A 


6 





73 






ftc 


flnulhdu; 


SI 





A 


e 





^ 






Re 


mulhw;f 


31 


D 


A 


B 





75 


Re 


tnulhwux 


31 


D 


A 


B 


i;s 


11 


Re 


muwd 


3t 


D 


A 


■iliiii 


9 


IIIIHB 


iliiii 




l^C 


mullWA' 


31 


D 


A 


B 


OE 


235 


Re 


mulli 


7 


D 


A 


SIMM 


nabsx 


31 


D 


A 


ooooo 


od 


488 


Re 


nandA^ 


31 


S 


A 


B 


476 


Re 


negx 


31 


D 


A 


ooooo 


OE 


104 


Re 


norx 


31 


S 


A 


B 


124 


Re 


or AT 


31 


S 


A 


B 


444 


Re 


orcx 


31 


S 


A 


B 


412 


Re 


ori 


24 


S 


A 


UIMM 


oris 


25 


S 


A 


UIMM 


rfi 


19 


:||||i|i||||i 


;|||i;i;|i;i||| 


ooooo 


50 





ridel 


■■■■III 


■■1^ 


iiiiiiiiiiiii 


8 


MB 


^ 


Re 


«?ter 


IBIIiliili 


::o:^;:;:o:::::^;;^;SiS:x:S>::::::::::::::::::: 

iiiiiiiilii 


■HHBIIi 


5 


ME 


Bill 


Re 


fldlc 


^■llM 


liiilliiiB 


iiiillwliiiill 


SH 


m 


IIHII 


SH: 


R<J 


rtdfct 


BBiiMiliB 


Biiiiiiiil 


IBIiBiiiii 


m 


,., 

m 


lllll 


SW 


nc 


rJdicr 


■■■ii 


iiiiiiiiiii 


iHIIBiii 


BH 


m 


!■! 


$H 


fie 


rtdimi 


■liMiiiii 


ilBH^BI 


■^^■1 


SH 


UB 


iiiipllii 


SH 


Re 


rlwimi;f 


20 


s 


A 


SH 


MB 


ME 


Re 


rlwinmx 


21 


s 


A 


SH 


MB 


ME 


Re 


rlwnmx 


23 


s 


A 


B 


MB 


ME 


Re 


rribx 


31 


s 


A 


B 


537 


Re 


sc 


17 


ooooo 


ooooo 


ooooooooooooooo 


1 





${bia 


31 


0000{) 


OO0O<J 


OOOOO 


49& 







$lb)^ 


m 


OOQOii 


ooooo 


B 


434 





albfex 


31 


QQQQQ 


ooooo 


s 


466 
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Name o 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



srd 



slex 



sleqx 
sliq^f 
slliqx 
sllqx 
slqx 



slwx 

srad 

sr^di 

sraq;f 

sraiqx 

sraw;^ 

srawix 

srd 

Name 

srex 

sreax 

sreqx 

srd 

sriqx 

sriiqx 

sriqx 

srqx 

srwx 

stb 

stbu 

stbux 

stbx 

&td 

stdcx. 

^du 

$tdax 



3J 


;|||||||j| 


IIIIIIIII 


IIIIJIIII 


llllllllllllllllll 


Re 


31 


s 


A 


B 


153 


Re 


31 


s 


A 


B 


217 


Re 


31 


s 


A 


SH 


184 


Re 


31 


s 


A 


SH 


248 


Re 


31 


s 


A 


B 


216 


Re 


31 


s 


A 


B 


152 


Re 


31 


s 


A 


B 


24 


Re 


31 


s 


A 


B 


794 


Re 


31 


llllillll 


■■■■■I 


■iilllll 


413 


II 


Re 


31 


s 


A 


B 


920 


Re 


31 


s 


A 


SH 


952 


Re 


31 


s 


A 


B 


792 


Re 


31 


s 


A 


SH 


824 


Re 


31 


s 


A 


B 


539 


Re 


6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 


31 


s 


A 


B 


665 


Re 


31 


s 


A 


B 


921 


Re 


31 


s 


A 


B 


729 


Re 


3? 


lllllll 


iiiBiii 


■■■■■i 


ililllillllll^ 


m 


31 


s 


A 


SH 


696 


Re 


31 


s 


A 


SH 


760 


Re 


31 


s 


A 


B 


728 


Re 


31 


s 


A 


B 


664 


Re 


31 


s 


A 


B 


536 


Re 


38 


s 


A 


d 


39 


s 


A 


d 


31 


s 


A 


B 


247 


M 


31 


s 


A 


B 


215 


Wl 


62 


liiiiiii 


liiBllll 


tfe 


OO 


3i 


■lllllll 


HiMli 


S 


214 


;1 


^2 


iiiiiiiii 


Illiliiii 


^ 


■I 


31 


illiliiii 


■llilill 


8 


181 









Name 

stdx 

stfd 

stfdu 

stfdux 

stfdx 

stfiwx 

stfs 

stfsu 

stfsux 

stfsx 

sth 

sthbrx 

sthu 

sthux 

sthx 

stmw 

stswi 

stswx 

stw 

stwbrx 

stwcx. 

stwu 

stwux 

stwx 

subfx 

subfcx 

subfex 

subfic 

subfmex 

subizex 

sync 

td 

tdi 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



3? 


iiiiiiiiBi 


llllljllj 


iiiiiiiiiil 


iiiiiiiBiiiiiis 


11 


54 


frS 


A 


d 


55 


frS 


A 


d 


31 


frS 


A 


B 


759 


M 


31 


frS 


A 


B 


727 


ii 


SI 


iill^lili 


lilHllI 


iiii::ilft:liiiii; 


IBllBilililllllii 


m 


52 


frS 


A 


d 


53 


frS 


A 


d 


31 


frS 


A 


B 


695 


Wi 


31 


frS 


A 


B 


663 


l§ 


44 


S 


A 


d 


31 


S 


A 


B 


918 


:|| 


45 


S 


A 


d 


31 


S 


A 


B 


439 


if 


31 


S 


A 


B 


407 


M 


47 


S 


A 


d 


31 


S 


A 


NB 


725 


ill 


31 


S 


A 


B 


661 


m 


36 


s 


A 


d 


31 


s 


A 


B 


662 


11: 


31 


s 


A 


B 


150 


1 


37 


s 


A 


d 


31 


s 


A 


B 


183 


Is 


31 


s 


A 


B 


151 


11 


31 


D 


A 


B 


OE 


40 


Re 


31 


D 


A 


B 


OE 


8 


Re 


31 


D 


A 


B 


OE 


136 


Re 


08 


D 


A 


SIMM 


31 


D 


A 


00000 


OE 


232 


Re 


31 


D 


A 


00000 


OE 


200 


Re 


31 


00000 






598 





00000 


00000 


31 


TO 


A 


B 


^8 





02 


TO 


A 


SMM 
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Name 

tibia 

tibie 

llt)?ex 

tibsync 

tw 

twi 

xorx 

xori 

xoris 



6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 



■iiiiiii 


OOOOtl 


OQDOO 


i||iii:;ii||| 


lllllliilillH^^ 


11 


31 


00000 


00000 


B 


306 


m 


Bllllilllll 


00000 


00000 


■■iilll 


lillilllHIilillll 


III 


lllliilB 


00000 


OOOOO 


i|iliiilllll 


iiiiiiiiiiiiiiiii 


11 


31 


TO 


A 


B 


4 


il 


03 


TO 


A 


SIMM 


31 


S 


A 


B 


316 


Re 


26 


S 


A 


UIMM 


27 


S 


A 


UIMM 
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A.2 PowerPC Instruction List Sorted by Opcode 

Table A-2 lists only the instructions defined in the PowerPC architecture that are 
implemented in the MPC601, in numeric order by opcode. It does not include the 
MPC601-only instructions implemented for POWER architecture compatibility. It also does 
not include the PowerPC instructions not implemented on the MPC601. 



Table A-2. PowerPC Instructions Implemented by MPC601: by Opcode 

Primary   Extended 
Opcode    Opcode     Mnemonic   Instruction 
3                    twi        Trap Word Immediate 
7                    mulli      Multiply Low Immediate 
8                    subfic     Subtract from Immediate Carrying 
10                   cmpli      Compare Logical Immediate 
11                   cmpi       Compare Immediate 
12                   addic      Add Immediate Carrying 
13                   addic.     Add Immediate Carrying and Record 
14                   addi       Add Immediate 
15                   addis      Add Immediate Shifted 
16                   bcx        Branch Conditional 
17        1          sc         System Call 
18                   bx         Branch 
19        0          mcrf       Move Condition Register Field 
19        16         bclrx      Branch Conditional to Link Register 
19        33         crnor      Condition Register NOR 
19        50         rfi        Return from Interrupt 
19        129        crandc     Condition Register AND with Complement 
19        150        isync      Instruction Synchronize 
19        193        crxor      Condition Register XOR 
19        225        crnand     Condition Register NAND 
19        257        crand      Condition Register AND 
19        289        creqv      Condition Register Equivalent 
19        417        crorc      Condition Register OR with Complement 
19        449        cror       Condition Register OR 
19        528        bcctrx     Branch Conditional to Count Register 
20                   rlwimix    Rotate Left Word Immediate then AND with Mask Insert 
21                   rlwinmx    Rotate Left Word Immediate then AND with Mask 
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Table A-2. PowerPC Instructions Implemented by MPC601 : by Opcode (Continued) 



Primary   Extended 
Opcode    Opcode     Mnemonic   Instruction 
23                   rlwnmx     Rotate Left Word then AND with Mask 
24                   ori        OR Immediate 
25                   oris       OR Immediate Shifted 
26                   xori       XOR Immediate 
27                   xoris      XOR Immediate Shifted 
28                   andi.      AND Immediate 
29                   andis.     AND Immediate Shifted 
31        0          cmp        Compare 
31        4          tw         Trap Word 
31        8          subfcx     Subtract from Carrying 
31        10         addcx      Add Carrying 
31        11         mulhwux    Multiply High Word Unsigned 
31        19         mfcr       Move from Condition Register 
31        20         lwarx      Load Word and Reserve Indexed 
31        23         lwzx       Load Word and Zero Indexed 
31        24         slwx       Shift Left Word 
31        26         cntlzwx    Count Leading Zeros Word 
31        28         andx       AND 
31        32         cmpl       Compare Logical 
31        40         subfx      Subtract from 
31        54         dcbst      Data Cache Block Store 
31        55         lwzux      Load Word and Zero with Update Indexed 
31        60         andcx      AND with Complement 
31        75         mulhwx     Multiply High Word 
31        83         mfmsr      Move from Machine State Register 
31        86         dcbf       Data Cache Block Flush 
31        87         lbzx       Load Byte and Zero Indexed 
31        104        negx       Negate 
31        115        mfpmr      Move from Program Mode Register 
31        119        lbzux      Load Byte and Zero with Update Indexed 
31        124        norx       NOR 
31        136        subfex     Subtract from Extended 
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Table A-2. PowerPC Instructions Implemented by MPC601 : by Opcode (Continued) 



Primary 
Opcode 


Extended 
Opcode 


Mnemonic 


instruction 


31 


138 


addex 


Add Extended 


31 


144 


mtcrf 


Move to Condition Register Fields 


31 


146 


mtmsr 


Move to Maciiine State Register ^ 


31 


150 


stwcx. 


Store Word Conditional Indexed 


31 


151 


stwx 


Store Word Indexed 


31 


178 


mtpmr 


Move to Program Mode Register 


31 


183 


stwux 


Store Word with Update Indexed 


31 


200 


subfzex


Subtract from Zero Extended 


31 


202 


addzex 


Add to Zero Extended 


31 


210 


mtsr 


Move to Segment Register 


31 


215 


stbx 


Store Byte Indexed 


31 


232 


subfmex 


Subtract from Minus One Extended 


31 


234 


addmex


Add to Minus One Extended 


31 


235 


mullx


Multiply Low 


31 


242 


mtsrin 


Move to Segment Register Indirect 


31 


246 


dcbtst 


Data Cache Block Touch for Store 


31 


247 


stbux 


Store Byte with Update Indexed 


31 


266 


addx


Add 


31 


275 


mftb 


Move from Time Base 


31 


278 


dcbt


Data Cache Block Touch 


31 


279 


lhzx


Load Halfword and Zero Indexed 


31 


284 


eqvx


Equivalent 


31 


306 


tlbie


TLB Invalidate Entry 


31 


307 


mftbu 


Move from Time Base Upper 


31 


310 


eciwx 


External Control Input Word Indexed


31 


311 


lhzux


Load Halfword and Zero with Update Indexed 


31 


316 


xorx 


XOR 


31 


339 


mfspr 


Move from Special Purpose Register 


31 


343 


lhax


Load Halfword Algebraic Indexed 


31 


375 


lhaux


Load Halfword Algebraic with Update Indexed 


31 


403 


mttb 


Move to Time Base ^ 


31 


407 


sthx


Store Halfword Indexed 


31 


412 


orcx 


OR with Complement 


31 


434 


slbie


SLB Invalidate Entry 


31 


435 


mttbu 


Move to Time Base Upper 


31 


438 


ecowx 


External Control Output Word Indexed


31 


439 


sthux 


Store Halfword with Update Indexed


31 


444 


orx 


OR 


31 


459 


divwux


Divide Word Unsigned 


31 


466 


slbiex


SLB Invalidate Entry by Index 


31 


467 


mtspr 


Move to Special Purpose Register 


31 


470 


dcbi 


Data Cache Block Invalidate


31 


476 


nandx 


NAND 


31 


491 


divwx 


Divide Word 


31 


498 


slbia


SLB Invalidate All


31 


512 


mcrxr 


Move to Condition Register from XER 


31 


533 


lswx


Load String Word Indexed


31 


534 


lwbrx


Load Word Byte-Reverse Indexed 


31 


535 


lfsx


Load Floating-Point Single-Precision Indexed


31 


536 


srwx 


Shift Right Word 


31 


567 


lfsux


Load Floating-Point Single-Precision with Update Indexed 


31 


595 


mfsr 


Move from Segment Register 


31 


597 


lswi


Load String Word Immediate 


31 


598 


sync 


Synchronize 


31 


599 


lfdx


Load Floating-Point Double-Precision Indexed 


31 


631 


lfdux


Load Floating-Point Double-Precision with Update Indexed 


31 


659 


mfsrin 


Move from Segment Register Indirect 


31 


661 


stswx 


Store String Word Indexed 


31 


662 


stwbrx 


Store Word Byte-Reverse Indexed 


31 


663 


stfsx 


Store Floating-Point Single-Precision Indexed 


31 


695 


stfsux 


Store Floating-Point Single-Precision with Update Indexed 


31 


725 


stswi 


Store String Word Immediate 


31 


727 


stfdx 


Store Floating-Point Double-Precision Indexed 


31 


759 


stfdux 


Store Floating-Point Double-Precision with Update Indexed


31 


790 


lhbrx


Load Halfword Byte-Reverse Indexed 


31 


792 


srawx 


Shift Right Algebraic Word 


31 


824 


srawix


Shift Right Algebraic Word Immediate 


31 


854 


eieio 


Enforce In-Order Execution of I/O 


31 


918 


sthbrx


Store Halfword Byte-Reverse Indexed 


31 


922 


extshx 


Extend Sign Halfword 


31 


954 


extsbx 


Extend Sign Byte 


31 


982 


icbi 


Instruction Cache Block Invalidate 


31 


983 


stfiwx 


Store Floating-Point as Integer Word Indexed 


31 


1014 


dcbz 


Data Cache Block set to Zero 


31 




tlbia


TLB Invalidate All 


31 




tlbiex


TLB Invalidate Entry by Index 


32 




lwz


Load Word and Zero 


33 




lwzu


Load Word and Zero with Update 


34 




lbz


Load Byte and Zero 


35 




lbzu


Load Byte and Zero with Update 


36 




stw 


Store Word 


37 




stwu 


Store Word with Update 


38 




stb 


Store Byte 


39 




stbu 


Store Byte with Update 


40 




lhz


Load Halfword and Zero 


41 




lhzu


Load Halfword and Zero with Update 


42 




lha


Load Halfword Algebraic 


43 




lhau


Load Halfword Algebraic with Update 


44 




sth 


Store Halfword 


45 




sthu 


Store Halfword with Update 


46 




lmw


Load Multiple Word 


47 




stmw 


Store Multiple Word 


48 




lfs


Load Floating-Point Single-Precision 


49 




lfsu


Load Floating-Point Single-Precision with Update 


50 




lfd


Load Floating-Point Double-Precision


51 




lfdu


Load Floating-Point Double-Precision with Update


52 




stfs 


Store Floating-Point Single-Precision 


53 




stfsu 


Store Floating-Point Single-Precision with Update


54 




stfd 


Store Floating-Point Double-Precision 


55 




stfdu 


Store Floating-Point Double-Precision with Update 


59 


18 


fdivsx 


Floating-Point Divide Single-Precision 


59 


20 


fsubsx 


Floating-Point Subtract Single-Precision 


59 


21 


faddsx 


Floating-Point Add Single-Precision 


59 


22 


fsqrtsx


Floating-Point Square Root Single-Precision 


59 


24 


fresx 


Floating-Point Reciprocal Estimate Single-Precision 


59 


25 


fmulsx 


Floating-Point Multiply Single-Precision 


59 


28 


fmsubsx 


Floating-Point Multiply-Subtract Single-Precision 


59 


29 


fmaddsx


Floating-Point Multiply-Add Single-Precision 


59 


30 


fnmsubsx


Floating-Point Negative Multiply-Subtract Single-Precision 


59 


31 


fnmaddsx 


Floating-Point Negative Multiply-Add Single-Precision 


63


0


fcmpu


Floating-Point Compare Unordered 


63 


12 


frspx 


Floating-Point Round to Single-Precision 


63 


14 


fctiwx 


Floating-Point Convert to Integer Word 


63 


15 


fctiwzx 


Floating-Point Convert to Integer Word with Round Toward Zero


63 


18 


fdivx 


Floating-Point Divide 


63 


20 


fsubx 


Floating-Point Subtract 


63 


21 


faddx 


Floating-Point Add 


63 


22 


fsqrtx


Floating-Point Square Root 


63 


23 


fselx 


Floating-Point Select 


63 


25 


fmulx 


Floating-Point Multiply 


63 


26 


frsqrtex 


Floating-Point Reciprocal Square Root Estimate 


63 


28 


fmsubx 


Floating-Point Multiply-Subtract 


63 


29 


fmaddx 


Floating-Point Multiply-Add 


63 


30 


fnmsubx 


Floating-Point Negative Multiply-Subtract 


63 


31 


fnmaddx 


Floating-Point Negative Multiply-Add 


63 


32 


fcmpo 


Floating-Point Compare Ordered 


63 


38 


mtfsb1x


Move to FPSCR Bit 1 


63 


40 


fnegx 


Floating-Point Negate 


63 


64 


mcrfs 


Move to Condition Register from FPSCR 


63 


70 


mtfsb0x


Move to FPSCR Bit 0


63 


72 


fmrx 


Floating-Point Move Register 


63 


134 


mtfsfix 


Move to FPSCR Field Immediate 


63 


136 


fnabsx


Floating-Point Negative Absolute Value 


63 


264 


fabsx 


Floating-Point Absolute Value 


63 


583 


mffsx


Move from FPSCR 


63 


711 


mtfsfx


Move to FPSCR Fields 
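
The two opcode columns of Table A-2 correspond to fields of the 32-bit instruction word. The following Python sketch shows one way to extract them. It assumes the usual PowerPC layout, in which bit 0 is the most significant bit, the primary opcode occupies bits 0-5, and X-form instructions carry a 10-bit extended opcode in bits 21-30. Forms with a shorter extended opcode (for example, the floating-point arithmetic instructions under primary opcode 59) are not handled. The function names are illustrative only.

```python
def primary_opcode(word: int) -> int:
    """Return bits 0-5 (the six most significant bits) of a 32-bit instruction."""
    return (word >> 26) & 0x3F


def extended_opcode_x(word: int) -> int:
    """Return bits 21-30 of a 32-bit instruction (X-form extended opcode)."""
    return (word >> 1) & 0x3FF


# lwzx r3,r4,r5 assembled by hand: primary 31, rD=3, rA=4, rB=5, extended 23.
LWZX_R3_R4_R5 = (31 << 26) | (3 << 21) | (4 << 16) | (5 << 11) | (23 << 1)

# lwz r3,0(r4): primary 32, no extended opcode.
LWZ_R3_0_R4 = (32 << 26) | (3 << 21) | (4 << 16)

print(primary_opcode(LWZX_R3_R4_R5), extended_opcode_x(LWZX_R3_R4_R5))
print(primary_opcode(LWZ_R3_0_R4))
```

The pair (31, 23) returned for the first word matches the lwzx row of the table.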

Appendix B

POWER Architecture Cross Reference 

This section identifies the incompatibilities that must be managed in migration from the 
POWER architecture to PowerPC architecture. Some of the incompatibilities can, at least 
in principle, be detected by the processor, which traps and lets software simulate the 
POWER operation. Others cannot be detected by the processor. 

In general, the incompatibilities identified here are those that affect a POWER application
program. Incompatibilities for instructions that can be used only by POWER system
programs are not discussed. Note that this section describes incompatibilities with respect
to the PowerPC architecture in general. The MPC601 is more closely compatible with the
POWER architecture. POWER instructions implemented in the MPC601 are listed in
Table B-4.

B.1 New Instructions, Formerly Privileged Instructions

Instructions new to PowerPC typically use opcode values (including extended opcode) that
are illegal in the POWER architecture. A few instructions that are privileged in the POWER
architecture (for example, dclz, called dcbz in the PowerPC architecture) have been made
non-privileged in the PowerPC architecture. Any POWER program that executes one of
these now-valid or now-non-privileged instructions, expecting to cause the system illegal
instruction error handler (program exception) or the system privileged instruction error
handler to be invoked, will not execute correctly on PowerPC processors.

B.2 Newly Privileged Instructions 

The following instructions are user-level in the POWER architecture but are supervisor- 
level in PowerPC processors. 

• mfmsr 

• mfsr 

B.3 Reserved Bits in Instructions 

These are shown with '/'s in the instruction opcode definitions. In the POWER architecture
such bits are ignored by the processor. In PowerPC architecture they must be 0 or the
instruction form is invalid. In several cases the PowerPC architecture assumes that such bits
in POWER instructions are indeed 0. The cases include the following:

• cmpi, cmp, cmpli, and cmpl assume that bit 10 in the POWER instructions is 0. 

• mtspr and mfspr assume that bits 16-20 in the POWER instructions are 0.
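
The two cases above can be checked mechanically on a POWER instruction word. The following Python sketch is one possible check, not a description of any processor logic. It assumes big-endian bit numbering (bit 0 is the most significant bit) and uses the opcode values from Table A-2 (cmp 31/0, cmpl 31/32, mfspr 31/339, mtspr 31/467), together with primary opcodes 10 and 11 for cmpli and cmpi. The helper names `field` and `power_reserved_bits_clear` are hypothetical.

```python
def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 32-bit word, with bit 0 the most significant."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def power_reserved_bits_clear(word: int) -> bool:
    """Return True if the bits that PowerPC assumes to be 0 are in fact 0."""
    primary = field(word, 0, 5)
    if primary in (10, 11):              # cmpli, cmpi
        return field(word, 10, 10) == 0
    if primary == 31:
        extended = field(word, 21, 30)
        if extended in (0, 32):          # cmp, cmpl
            return field(word, 10, 10) == 0
        if extended in (339, 467):       # mfspr, mtspr
            return field(word, 16, 20) == 0
    return True                          # not one of the listed cases


# mtspr with SPR[bits 11-15]=8 and rS=3, assembled by hand.
MTSPR_8_R3 = (31 << 26) | (3 << 21) | (8 << 16) | (467 << 1)
print(power_reserved_bits_clear(MTSPR_8_R3))
print(power_reserved_bits_clear(MTSPR_8_R3 | (1 << 11)))  # bit 20 set
```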

B.4 Reserved Bits in Registers 

The POWER architecture defines these bits to be 0 when read, and either 0 or 1 when
written to. In the PowerPC architecture it is implementation-dependent, for each register,
whether these bits are 0 when read and ignored when written to or are copied from source
to target when read or written to.

B.5 Alignment Check

The AL bit in the POWER machine-state register, MSR[24], is not supported in the
PowerPC architecture. The bit is reserved in the PowerPC architecture. The low-order bits
of the EA are always used. Notice that the value 0 (the normal value for a reserved SPR
bit) means "ignore the low-order EA bits" in the POWER architecture, and the value 1
means "use the low-order EA bits." However, MSR[24] is not assigned new meaning in
PowerPC.

B.6 Condition Register 

The following instructions specify a field in the CR explicitly (via the BF field) and also
have the record bit option. In the PowerPC architecture, if Rc=1 for these instructions the
instruction form is invalid. In the POWER architecture, if Rc=1 the instructions execute
normally except as shown in Table B-1.

Table B-1. Condition Register Settings

Instruction   Setting
cmp           CR0 is undefined if Rc=1 and BF≠0
cmpl          CR0 is undefined if Rc=1 and BF≠0
mcrxr         CR0 is undefined if Rc=1 and BF≠0
fcmpu         CR1 is undefined if Rc=1
fcmpo         CR1 is undefined if Rc=1
mcrfs         CR1 is undefined if Rc=1 and BF≠1

B.7 Inappropriate Use of LK and Rc Bits

For the instructions listed below, if LK=1 or Rc=1, POWER processors execute the
instruction normally with the exception of setting the link register (if LK=1) or the
condition register field 0 or 1 (if Rc=1) to an undefined value. In the PowerPC architecture,
such instruction forms are invalid.

PowerPC instruction form is invalid if LK=1: 

• sc (svc in the POWER architecture) 

• Condition register logical instructions 

• mcrf 

• isync (ics in the POWER architecture) 

PowerPC instruction form is invalid if Rc=1:

• Integer X-form load and store instructions 

• Integer X-form compare instructions 

• X-form trap instruction 

• mtspr, mfspr, mtcrf, mcrxr, mfcr, mtmsr, mfmsr, mtsr, mtsrin, tlbi, eciwx,
ecowx, clcs, mfsr, mfsrin, sync, eieio, icbi

• Floating-point X-form load and store instructions and floating-point compare 
instructions 

• mcrfs 

• dcbz (dclz in the POWER architecture) 

B.8 BO Field 

The POWER architecture shows certain bits in the BO field — used by branch conditional 
instructions — as x without indicating how these bits are to be interpreted. These bits are 
ignored by POWER processors. The PowerPC architecture treats these bits differently, as 
shown in Table B-2. 

Table B-2. Differences in the BO Field 



BO Field   Description
BO[0-3]    The PowerPC architecture shows the bits as z. If such a bit is not cleared,
           the instruction form is invalid.
BO[4]      This bit, which is shown as x in the POWER architecture independent of
           the other four bits, is shown in the PowerPC architecture as y. It gives a
           hint about whether the branch is likely to be taken. If a POWER program
           has the wrong value for this bit, the program runs correctly but
           performance may suffer.
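
The z and y rules in Table B-2 can be illustrated with a short Python sketch. It assumes the standard PowerPC BO encodings, which are not reproduced in this appendix (0000y, 0001y, 001zy, 0100y, 0101y, 011zy, 1z00y, 1z01y, and 1z1zz): BO[1] is a z bit when BO[0]=1, BO[3] is a z bit when BO[2]=1, and BO[4] is a z bit only in the branch-always encoding. The function names are hypothetical.

```python
def bo_bits(bo: int) -> list:
    """Split a 5-bit BO value into [BO[0], ..., BO[4]], with BO[0] most significant."""
    return [(bo >> (4 - i)) & 1 for i in range(5)]


def bo_form_is_valid(bo: int) -> bool:
    """Return True if every bit marked z in the assumed encodings is cleared."""
    b = bo_bits(bo)
    z_bits = []
    if b[0] == 1:                  # condition not tested, so BO[1] is a z bit
        z_bits.append(b[1])
    if b[2] == 1:                  # CTR not decremented, so BO[3] is a z bit
        z_bits.append(b[3])
    if b[0] == 1 and b[2] == 1:    # branch always, so BO[4] is also a z bit
        z_bits.append(b[4])
    return all(bit == 0 for bit in z_bits)


def bo_hint(bo: int) -> int:
    """Return BO[4], the y hint bit (meaningful only when it is not a z bit)."""
    return bo & 1


print(bo_form_is_valid(0b10100))   # branch always, all z bits cleared
print(bo_form_is_valid(0b01110))   # 011zy with z set
```

A POWER program that left arbitrary values in the x positions may therefore produce an invalid PowerPC form, or only a misleading hint, depending on which bit is affected.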



B.9 Branch Conditional to Count Register 

For the case in which the count register is decremented and tested (that is, the case in which 
BO[2]=0), the POWER architecture specifies only that the branch target address is
undefined, implying that the count register, and the link register (if LK=1), are updated in 
the normal way. The PowerPC architecture considers this instruction form invalid. 

B.10 System Call/Supervisor Call

The System Call (sc) instruction in the PowerPC architecture is called Supervisor Call
(svc) in the POWER architecture. Differences in implementations are as follows:

• The POWER architecture provides a version of the Supervisor Call instruction (bit
30 = 0) that allows instruction fetching to continue at any one of 128 locations. It is
used for "fast SVCs." The PowerPC architecture provides no such version.

• The POWER architecture provides a version of the Supervisor Call instruction (bits
30-31 = b'11') that resumes instruction fetching at one location and sets the link
register to the address of the next instruction. The PowerPC architecture provides no
such version; if bit 31 of the instruction is 1, the instruction form is invalid.

• For the POWER architecture, information from the MSR is saved in the count
register. For the PowerPC architecture, this information is saved in SRR1.

• The POWER architecture permits bits 16-29 of the Supervisor Call instruction to be
non-zero, while in the PowerPC architecture, such an instruction form is invalid.

• Bits 16-29 of the Supervisor Call instruction are regarded as reserved for the
POWER architecture. As long as POWER compatibility is required for this
instruction, bits 16-29 are ignored by the processor.

• The POWER architecture saves the low-order 16 bits of the Supervisor Call
instruction in the count register; the PowerPC architecture does not save them.

• The settings of the MSR bits by the system call exception differ between the 
POWER architecture and the PowerPC architecture. 

B.11 Update Forms of Memory Access 

The PowerPC architecture requires that rA not be equal to either rD (integer load only) or 
0. If the restriction is violated, the instruction form is invalid. See Appendix D, "Classes of 
Instructions," for information about invalid instructions. The POWER architecture permits 
these cases and simply avoids saving the EA. 
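
The restriction on update forms can be expressed as a simple predicate. The Python sketch below is only an illustration of the rule as stated in this section; the function name and the `is_integer_load` flag are hypothetical.

```python
def update_form_is_valid(ra: int, rd: int, is_integer_load: bool) -> bool:
    """Apply the PowerPC rule for update forms: rA must not be 0, and for
    integer loads rA must not equal rD."""
    if ra == 0:
        return False
    if is_integer_load and ra == rd:
        return False
    return True


# Load word with update where rA == rD: invalid in PowerPC, permitted in POWER.
print(update_form_is_valid(ra=3, rd=3, is_integer_load=True))
# Store word with update where rA equals the source register: valid.
print(update_form_is_valid(ra=1, rd=1, is_integer_load=False))
```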

B.12 Multiple Register Loads 

The PowerPC architecture requires that rA, and rB if present in the instruction format, not
be in the range of registers to be loaded, while the POWER architecture permits this and 
does not alter rA or rB in this case. (The PowerPC architecture restriction applies even if 
rA=0, although there is no obvious benefit to the restriction in this case since rA is not used 
to compute the effective address if rA=0.) If the PowerPC architecture restriction is 
violated, the instruction form is invalid. The instructions affected are listed as follows: 

• lmw (lm in the POWER architecture)

• lswi (lsi in the POWER architecture)

• lswx (lsx in the POWER architecture)

Thus, for example, an lmw instruction that loads all 32 registers is valid in the POWER
architecture but is an invalid form in the PowerPC architecture.
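
For the load multiple case, the rule reduces to a range test, because the instruction loads rD through r31. The following Python sketch illustrates this under that assumption; the function name is hypothetical, and the string instructions (whose register range depends on the byte count) are not modeled.

```python
def lmw_form_is_valid(rd: int, ra: int) -> bool:
    """Load multiple word loads rD through r31; under the PowerPC rule the form
    is invalid if rA lies in that range (even when rA is 0)."""
    return ra not in range(rd, 32)


print(lmw_form_is_valid(rd=29, ra=1))   # r29-r31 loaded, base in r1
print(lmw_form_is_valid(rd=0, ra=0))    # all 32 registers loaded
```

The second call corresponds to the example in the text: loading all 32 registers always places rA inside the loaded range.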

B.13 Alignment for Load/Store Multiple 

The PowerPC architecture requires the EA to be word-aligned and yields an alignment
exception or boundedly undefined results if it is not. The POWER architecture specifies that
an alignment exception occurs (if AL=1).

B.14 Load String Instructions 

In the PowerPC architecture, an lswx instruction with zero length leaves the content of rD
undefined, while in the POWER architecture the corresponding instruction (lsx) does not
alter rD.

B.15 Synchronization 

The sync instruction (called dcs in the POWER architecture) causes a much more pervasive 
synchronization in the PowerPC architecture than in the POWER architecture. For more 
information, refer to Chapter 10, "Instruction Set." 

B.16 Move to/from SPR 

The differences in the Move to/from Special Purpose Register (mtspr and mfspr)
instructions are as follows:

• The SPR field is 10 bits long in the PowerPC architecture, but only 5 in POWER
architecture.

• The mfspr instruction can be used to read the decrementer (DEC) register in 
problem state (user) mode in the POWER architecture, but only in supervisor state 
in the PowerPC architecture. 

• If the SPR value specified in the instruction is not one of the defined values, the 
PowerPC architecture considers the instruction form invalid. (In user mode, the 
allowed SPR values exclude those accessible only in supervisor mode.) The 
POWER architecture does not alter any architected registers in this case and 
generates a program exception if the instruction is executed in user mode and 
SPR[0]=1.

For PowerPC processors except the MPC601 processor, a program exception is generated
for an attempt to execute an mtspr or mfspr instruction with SPR[0-4]=0 (which denotes
the MQ register). Similarly, a program exception is generated for attempts to execute an
mfspr instruction with SPR[0-4]=6 (which denotes reading the decrementer register in the
POWER architecture).

B.17 Effects of Exceptions on FPSCR Bits FR and FI

For the following cases, the POWER architecture does not specify how the FR and FI bits
are set, while the PowerPC architecture preserves them for invalid operation exceptions
caused by compare instructions and clears them otherwise.

• Invalid operation exception (enabled or disabled) 

• Zero divide exception (enabled or disabled) 

• Disabled overflow exception 

B.18 Floating-Point Store Instructions

The POWER architecture uses FPSCR[UE] to help determine whether denormalization
should be done, while the PowerPC architecture does not. Using FPSCR[UE] is in fact
incorrect: in the POWER architecture, if FPSCR[UE]=1 and a denormalized
single-precision number is copied from one memory location to another by means of an lfs
instruction followed by an stfs instruction, the two "copies" may not be the same.

B.19 Move from FPSCR 

The POWER architecture defines the high-order 32 bits of the result of mffs to be x'FFFF 
FFFF'. In the PowerPC architecture they are undefined. 
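
Software that inspects the 64-bit result of mffs can avoid depending on the high-order word by masking it off. The Python sketch below illustrates the idea on an integer image of the register. It assumes that the FPSCR contents occupy the low-order 32 bits of the result, as the description above implies; the names `FPSCR_MASK` and `fpscr_from_mffs` are hypothetical.

```python
FPSCR_MASK = 0xFFFF_FFFF


def fpscr_from_mffs(fpr_image: int) -> int:
    """Return the low-order 32 bits of a 64-bit mffs result, discarding the
    high-order word (x'FFFF FFFF' on POWER, undefined on PowerPC)."""
    return fpr_image & FPSCR_MASK


power_result = 0xFFFF_FFFF_0000_0040      # high-order word as POWER defines it
powerpc_result = 0x1234_5678_0000_0040    # arbitrary high-order word
print(fpscr_from_mffs(power_result) == fpscr_from_mffs(powerpc_result))
```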

B.20 Clearing Bytes in the Data Cache 

The dclz instruction of the POWER architecture and the dcbz instruction of the PowerPC 
architecture have the same opcode. However, the functions differ in the following respects. 

• The dclz instruction clears a line; dcbz clears a block (a sector in the MPC601). 

• The dclz instruction saves the EA in rA (if rA≠0); dcbz does not.

• The dclz instruction is supervisor-level; dcbz is not. 

B.21 Segment Register Instructions 

The definitions of the four segment register instructions (mtsr, mtsrin, mfsr, and mfsrin)
differ in two respects between the POWER architecture and the PowerPC architecture.
Instructions similar to mtsrin and mfsrin are called mtsri and mfsri in the POWER
architecture.

Privilege — mfsr and mfsri are problem state instructions in the POWER architecture, 
while mfsr and mfsrin are privileged in the PowerPC architecture. 

Function — the indirect instructions (mtsri and mfsri) in the POWER architecture use an 
rA register in computing the segment register number, and the computed EA is stored into 
rA (if rA≠0 and rA≠rD); in the PowerPC architecture mtsrin and mfsrin have no rA field
and EA is not stored. 

The mtsr, mtsrin (mtsri), and mfsr instructions have the same opcodes in the PowerPC
architecture as in the POWER architecture. The mfsri instruction in the POWER 
architecture and the mfsrin instruction in PowerPC architecture have different opcodes. 

B.22 TLB Entry Invalidation 

The tlbi instruction of the POWER architecture and the tlbie instruction of the PowerPC
architecture have the same opcode. However, the functions differ in the following respects. 

• The tlbi instruction computes the EA as (rA|0) + (rB), while tlbie lacks an rA field
and computes the EA as (rB). 

• The tlbi instruction saves the EA in rA (if rA≠0); tlbie lacks an rA field and does
not save the EA. 
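
The difference in address computation can be modeled in a few lines. The Python sketch below covers only the effective-address step described in the two items above, not the TLB invalidation itself. It assumes 32-bit registers held in a list `gpr` indexed by register number; the function names are hypothetical.

```python
MASK32 = 0xFFFF_FFFF


def power_tlbi_ea(gpr: list, ra: int, rb: int) -> int:
    """POWER tlbi: EA = (rA|0) + (rB); the EA is saved in rA if rA is not 0."""
    base = 0 if ra == 0 else gpr[ra]
    ea = (base + gpr[rb]) & MASK32
    if ra != 0:
        gpr[ra] = ea
    return ea


def powerpc_tlbie_ea(gpr: list, rb: int) -> int:
    """PowerPC tlbie: EA = (rB); no register is updated."""
    return gpr[rb] & MASK32


gpr = [0] * 32
gpr[4], gpr[5] = 0x1000, 0x234
print(hex(power_tlbi_ea(gpr, 4, 5)), hex(gpr[4]))
print(hex(powerpc_tlbie_ea(gpr, 5)))
```

A POWER program that relies on rA being updated by tlbi will therefore see different register contents when the same opcode executes as tlbie.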

B.23 Timing Facilities 

This section describes differences between the POWER architecture and the PowerPC 
architecture timer facilities. 

B.23.1 Real-Time Clock 

The MPC601 implements a POWER-based RTC. Note that the POWER RTC is not
supported in the PowerPC architecture. Instead, the PowerPC architecture provides a time 
base (TB). Both the RTC and the time base are 64-bit special purpose registers, but they 
differ in the following respects. 

• The RTC counts seconds and nanoseconds, while the TB counts "ticks." The
frequency of the TB is implementation-dependent.

• The RTC increments discontinuously: 1 is added to RTCU when the value in
RTCL passes 999_999_999. The TB increments continuously: 1 is added to TBU
when the value in TBL passes x'FFFF FFFF'.

• The RTC is written and read by the mtspr and mfspr instructions, using SPR
numbers that denote the RTCU and RTCL. The TB is written and read by new
instructions (mttb, mttbu, mftb, and mftbu).

• The SPR numbers that denote RTCL and RTCU are invalid in the PowerPC
architecture except on the MPC601.

• The RTC is guaranteed to increment at least once in the time required to execute 10
Add Immediate (addi) instructions. No analogous guarantee is made for the TB.

• Not all bits of RTCL need be implemented, while all bits of the TB must be
implemented.
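
The difference between discontinuous and continuous incrementing can be seen in a small model. The Python sketch below treats both facilities as pairs of 32-bit values. It assumes that every bit of RTCL is implemented, which an actual implementation need not do, and the function names are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000
MASK32 = 0xFFFF_FFFF


def rtc_advance(rtcu: int, rtcl: int, nanoseconds: int):
    """Advance the RTC: RTCL counts nanoseconds and carries into RTCU
    (seconds) when it passes 999_999_999."""
    total = rtcl + nanoseconds
    return (rtcu + total // NS_PER_SECOND) & MASK32, total % NS_PER_SECOND


def tb_advance(tbu: int, tbl: int, ticks: int):
    """Advance the time base: TBU and TBL form one 64-bit binary counter, so
    TBL carries into TBU when it passes x'FFFF FFFF'."""
    value = (((tbu << 32) | tbl) + ticks) & 0xFFFF_FFFF_FFFF_FFFF
    return value >> 32, value & MASK32


print(rtc_advance(0, 999_999_999, 1))
print(tb_advance(0, 999_999_999, 1))
print(tb_advance(0, 0xFFFF_FFFF, 1))
```

Code that computes elapsed time by subtracting two 64-bit readings works for the TB but not for the RTC, where the upper and lower halves must be combined as seconds and nanoseconds.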

B.23.2 Decrementer 

The PowerPC architecture DEC register decrements at the same rate that the TB 
increments, while the POWER decrementers decrement every nanosecond (which is the 
same rate that the RTC increments). 

Not all bits of the POWER DEC need be implemented, while all bits of the PowerPC DEC
must be implemented. 

B.23.3 Deleted Instructions 

The following instructions are part of the POWER architecture but have been dropped from 
the PowerPC architecture. 

Table B-3. Deleted POWER Instructions 



Mnemonic 


Instruction 


Primary 
Opcode 


Secondary 
Opcode 


In MPC601 
Processor 


abs 


Absolute 


31 


360 


Yes 


clcs


Cache Line Compute Size 


31 


531 


Yes 


clf


Cache Line Flush 


31 


118 


No 


cli 


Cache Line Invalidate


31 


502 


No 


dclst


Data Cache Line Store 


31 


630 


No 


div 


Divide 


31 


331 


Yes 


divs 


Divide Short 


31 


363 


Yes 


doz 


Difference or Zero 


31 


264 


Yes 


dozi 


Difference or Zero Immediate 


09 


— 


Yes 


lscbx


Load String and Compare Byte Indexed 


31 


277 


Yes 


maskg 


Mask Generate 


31 


29 


Yes 


maskir 


Mask Insert from Register 


31 


541 


Yes 


mfsri 


Move from Segment Register Indirect 


31 


627 


Yes 


mul 


Multiply 


31 


107 


Yes 


nabs 


Negative Absolute 


31 


488 


Yes 


rac 


Real Address Compute 


31 


818 


No 


rlmi


Rotate Left then Mask Insert 


22 


— 


Yes 


rrib 


Rotate Right and Insert Bit 


31 


537 


Yes 


sle 


Shift Left Extended 


31 


153 


Yes 


sleq 


Shift Left Extended with MQ 


31 


217 


Yes 


sliq 


Shift Left Immediate with MQ 


31 


184 


Yes 


slliq


Shift Left Long Immediate with MQ 


31 


248 


Yes 


sllq


Shift Left Long with MQ 


31 


216 


Yes 


slq 


Shift Left with MQ 


31 


152 


Yes 


sraiq 


Shift Right Algebraic Immediate with MQ 


31 


952 


Yes 


sraq 


Shift Right Algebraic with MQ 


31 


920 


Yes 


sre 


Shift Right Extended 


31 


665 


Yes 


srea 


Shift Right Extended Algebraic 


31 


921 


Yes 


sreq 


Shift Right Extended with MQ


31 


729 


Yes 


sriq 


Shift Right Immediate with MQ 


31 


696 


Yes 


srliq


Shift Right Long Immediate with MQ


31 


760 


Yes 


srlq


Shift Right Long with MQ 


31 


728 


Yes 


srq 


Shift Right with MQ 


31 


664 


Yes 


svc[l] 


Supervisor Call, with SA=0 


17 





No 



Note: Many of these instructions use the MQ register. The MQ is not defined in the PowerPC architecture, 
but is implemented in the MPC601 processor. 

B.24 POWER Instructions Supported by the MPC601 Processor

Table B-4 lists the POWER instructions implemented in the PowerPC architecture.

Table B-4. POWER Instructions Implemented in PowerPC Architecture



POWER 


PowerPC 


Mnemonic


Instruction 


Mnemonic


Instruction 


a[o][.] 


Add 


addc[o][.] 


Add Carrying 


ae[o][.] 


Add Extended 


adde[o][.] 




ai 


Add Immediate 


addic 


Add Immediate Carrying 


ai. 


Add Immediate and 
Record 


addic. 


Add Immediate Carrying 
and Record 


ame[o][.] 


Add to Minus One 
Extended 


addme[o][.] 




andil. 


AND Immediate Lower 


andi. 


AND Immediate 


andiu. 


AND Immediate Upper 


andis. 


AND Immediate Shifted 


aze[o][.] 


Add to Zero Extended 


addze[o][.] 




bcc[l] 


Branch Conditional to 
Count Register 


bcctr[l] 




bcr[l] 


Branch Conditional to Link 
Register 


bclr[l] 




cal 


Compute Address Lower 


addi 


Add Immediate 


cau 


Compute Address Upper 


addis 


Add Immediate Shifted 


cax[o][.] 


Compute Address 


add[o][.] 


Add 


cntlz[.] 


Count Leading Zeros 


cntlzw[.] 


Count Leading Zeros 
Word 


dcs 


Data Cache Synchronize 


sync 


Synchronize 


exts[.] 


Extend Sign 


extsh[.] 


Extend Sign Half Word 


fa[.] 


Floating Add 


fadd[.] 




fd[.] 


Floating Divide 


fdiv[.] 




fm[.] 


Floating Multiply 


fmul[.] 




fma[.] 


Floating Multiply-Add 


fmadd[.] 




fms[.] 


Floating Multiply-Subtract 


fmsub[.] 




fnma[.] 


Floating Negative 
Multiply-Add 


fnmadd[.] 




fnms[.] 


Floating Negative 
Multiply-Subtract 


fnmsub[.] 




fs[.]


Floating Subtract 


fsub[.] 




l


Load 


lwz


Load Word and Zero 


lbrx


Load Byte-Reverse 
Indexed 


lwbrx


Load Word Byte-Reverse 
Indexed 


lm


Load Multiple 


lmw


Load Multiple Word 


lsi


Load String Immediate 


lswi


Load String Word 
Immediate 


lsx


Load String Indexed 


lswx


Load String Word Indexed 


lu 


Load with Update 


lwzu


Load Word and Zero with 
Update 


lux 


Load with Update Indexed 


lwzux


Load Word and Zero with
Update Indexed


lx


Load Indexed 


lwzx


Load Word and Zero 
Indexed 


mtsri 


Move to Segment 
Register Indirect 


mtsrin 


Move to Segment 
Register Indirect * 


muli


Multiply Immediate 


mulli 


Multiply Low Immediate 


muls[o][.]


Multiply Short 


mull[o][.]


Multiply Low 


oril 


OR Immediate Lower 


ori 


OR Immediate 


oriu 


OR Immediate Upper 


oris 


OR Immediate Shifted 


rlimi[.] 


Rotate Left Immediate 
then Mask Insert 


rlwimi[.] 


Rotate Left Word 
Immediate then Mask 
Insert 


rlinm[.] 


Rotate Left Immediate 
then AND With Mask 


rlwinm[.] 


Rotate Left Word 
Immediate then AND with 
Mask 


rlnm[.] 


Rotate Left then AND with 
Mask 


rlwnm[.] 


Rotate Left Word then 
AND with Mask 


sf[o][.] 


Subtract from 


subfc[o][.] 


Subtract from Carrying 


sfe[o][.] 


Subtract from Extended 


subfe[o][.] 




sfi 


Subtract from Immediate 


subfic 


Subtract from Immediate 
Carrying 


sfme[o][.] 


Subtract from Minus One 
Extended 


subfme[o][.] 




sfze[o][.] 


Subtract from Zero 
Extended 


subfze[o][.] 




sl[.] 


Shift Left 


slw[.] 


Shift Left Word 


sr[.] 


Shift Right 


srw[.] 


Shift Right Word 


sra[.] 


Shift Right Algebraic 


sraw[.] 


Shift Right Algebraic Word 


srai[.] 


Shift Right Algebraic 
Immediate 


srawi[.] 


Shift Right Algebraic Word 
Immediate 


st


Store 


stw 


Store Word 


stbrx 


Store Byte-Reverse 
Indexed 


stwbrx 


Store Word Byte-Reverse 
Indexed 


stm 


Store Multiple 


stmw 


Store Multiple Word 


stsi 


Store String Immediate 


stswi 


Store String Word 
Immediate 


stsx 


Store String Indexed 


stswx 


Store String Word 
Indexed 


stu 


Store with Update 


stwu 


Store Word with Update 


stux 


Store with Update 
Indexed 


stwux 


Store Word with Update 
Indexed 


stx


Store Indexed 


stwx 


Store Word Indexed 


svca 


Supervisor Call 


sc


System Call 


t 


Trap 


tw 


Trap Word 


ti 


Trap Immediate 


twi 


Trap Word Immediate * 


tlbi


TLB Invalidate Entry 


tlbie


TLB Entry Invalidate 


xoril 


XOR Immediate Lower 


xori 


XOR Immediate 


xoriu 


XOR Immediate Upper 


xoris 


XOR Immediate Shifted 



* Supervisor-level instruction 

Appendix C

PowerPC Instructions Not Implemented in MPC601

This appendix provides a list of 32-bit and 64-bit instructions that are not implemented by
the MPC601, and that generate an illegal instruction exception. It also provides the 32-bit
and 64-bit SPR encodings that are not implemented by the MPC601.

See Table C-1 for a list of the 32-bit instructions not implemented by the MPC601.

Table C-1. 32-Bit Instructions Not Implemented by the MPC601 



Mnemonic   Instruction
fres       Floating-Point Reciprocal Estimate Single-Precision
frsqrte    Floating-Point Reciprocal Square Root Estimate
fsel       Floating-Point Select
fsqrt      Floating-Point Square Root
fsqrts     Floating-Point Square Root Single-Precision
mftb       Move from Time Base
stfiwx     Store Floating-Point as Integer Word Indexed
tlbia      Translation Lookaside Buffer Invalidate All
tlbiex     Translation Lookaside Buffer Invalidate Entry by Index
tlbsync    Translation Lookaside Buffer Synchronize



Table C-2 provides a list of 32-bit SPR encodings that are not implemented by the
MPC601.

Table C-2. 32-Bit SPR Encodings Not Implemented by the MPC601

SPR                               Register
Decimal   SPR[5-9]   SPR[0-4]     Name       Access
284       01000      11100        TB         Supervisor
285       01000      11101        TBU        Supervisor
536       10000      11000        DBAT0U     Supervisor
537       10000      11001        DBAT0L     Supervisor
538       10000      11010        DBAT1U     Supervisor
539       10000      11011        DBAT1L     Supervisor
540       10000      11100        DBAT2U     Supervisor
541       10000      11101        DBAT2L     Supervisor
542       10000      11110        DBAT3U     Supervisor
543       10000      11111        DBAT3L     Supervisor



Table C-3 provides a list of 64-bit instructions that are not implemented by the MPC601,
and that generate an illegal instruction exception.

Table C-3. 64-Bit Instructions Not Implemented by the MPC601

Mnemonic   Instruction
cntlzd     Count Leading Zeros Double Word
divd       Divide Double Word
divdu      Divide Double Word Unsigned
extsw      Extend Sign Word
fcfid      Floating Convert From Integer Double Word
fctid      Floating Convert to Integer Double Word
fctidz     Floating Convert to Integer Double Word with Round to Zero
ld         Load Double Word
ldarx      Load Double Word and Reserve Indexed
ldu        Load Double Word with Update
ldux       Load Double Word with Update Indexed
ldx        Load Double Word Indexed
lwa        Load Word Algebraic
lwaux      Load Word Algebraic with Update Indexed
lwax       Load Word Algebraic Indexed
mulld      Multiply Low Double Word
mulhd      Multiply High Double Word
mulhdu     Multiply High Double Word Unsigned
rldcl      Rotate Left Double Word then Clear Left
rldcr      Rotate Left Double Word then Clear Right
rldic      Rotate Left Double Word Immediate then Clear
rldicl     Rotate Left Double Word Immediate then Clear Left
rldicr     Rotate Left Double Word Immediate then Clear Right
rldimi     Rotate Left Double Word Immediate then Mask Insert
slbia      SLB Invalidate All
slbie      SLB Invalidate Entry
slbiex     SLB Invalidate Entry by Index
sld        Shift Left Double Word
srad       Shift Right Algebraic Double Word
sradi      Shift Right Algebraic Double Word Immediate
srd        Shift Right Double Word
std        Store Double Word
stdcx.     Store Double Word Conditional Indexed
stdu       Store Double Word with Update
stdux      Store Double Word Indexed with Update
stdx       Store Double Word Indexed
td         Trap Double Word
tdi        Trap Double Word Immediate



Table C-4 provides the 64-bit SPR encoding that is not implemented by the MPC601.

Table C-4. 64-Bit SPR Encoding Not Implemented by the MPC601

SPR                               Register
Decimal   SPR[5-9]   SPR[0-4]     Name       Access
280       01000      11000        ASR        Supervisor





cntlzd    Not Implemented in MPC601

Count Leading Zeros Double Word

Integer Unit

cntlzd    rA,rS    (Rc=0)
cntlzd.   rA,rS    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    S      A       00000   58      Rc

Bits 16-20 are reserved.

N <- 0
do while N < 64
    if rS[N] = 1 then leave
    N <- N + 1
rA <- N

A count of the number of consecutive zero bits starting at bit 0 of register rS is placed into
rA. This number ranges from 0 to 64, inclusive.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
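
For emulation purposes, the counting loop above can be sketched in a few lines. This is a minimal illustration, not MPC601 code: the function name `cntlzd` and the use of a Python integer to stand in for a 64-bit register are assumptions made here.

```python
MASK64 = (1 << 64) - 1  # assumption: registers are modelled as 64-bit unsigned ints


def cntlzd(rs: int) -> int:
    """Count consecutive zero bits starting at bit 0 (the most significant bit)."""
    rs &= MASK64
    n = 0
    while n < 64:
        # PowerPC numbers bits from the left, so bit n has weight 2**(63 - n).
        if (rs >> (63 - n)) & 1:
            break
        n += 1
    return n
```

A register of all zeros yields 64, which is why the result range is 0 to 64 inclusive rather than 0 to 63.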



Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)



divd    Not Implemented in MPC601

Divide Double Word

Integer Unit

divd     rD,rA,rB    (OE=0 Rc=0)
divd.    rD,rA,rB    (OE=0 Rc=1)
divdo    rD,rA,rB    (OE=1 Rc=0)
divdo.   rD,rA,rB    (OE=1 Rc=1)

Bits    0-5   6-10   11-15   16-20   21   22-30   31
Field   31    D      A       B       OE   489     Rc

dividend[0-63] <- rA
divisor[0-63] <- rB
rD <- dividend ÷ divisor

The 64-bit dividend is rA. The 64-bit divisor is rB. The 64-bit quotient of the dividend and
divisor is placed into rD. The remainder is not supplied as a result.

Both the dividend and the divisor are interpreted as signed integers. The quotient is the
unique signed integer that satisfies

dividend = (quotient * divisor) + r

where 0 ≤ r < |divisor| if the dividend is nonnegative, and -|divisor| < r ≤ 0 if the dividend is
negative.

If an attempt is made to perform any of the divisions

0x8000_0000_0000_0000 ÷ -1
<anything> ÷ 0

then the contents of rD are undefined.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
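
The quotient rule above amounts to division that truncates toward zero. The sketch below illustrates it under stated assumptions: the names `divd` and `to_signed` are hypothetical, registers are modelled as 64-bit unsigned Python integers, and the two undefined cases are reported as `None` rather than as any particular register value.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Interpret a 64-bit register image as a two's-complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def divd(ra: int, rb: int):
    """Signed 64-bit divide; returns None where the text leaves rD undefined."""
    dividend, divisor = to_signed(ra), to_signed(rb)
    if divisor == 0 or (dividend == -(1 << 63) and divisor == -1):
        return None  # <anything> / 0, or 0x8000_0000_0000_0000 / -1
    # Truncate toward zero so the remainder takes the sign of the dividend.
    q = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        q = -q
    return q & MASK64
```

Note that Python's own `//` operator rounds toward negative infinity, so it cannot be applied directly to signed operands here.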

Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)

• Exception Register:
Affected: SO, OV (if OE=1)


divdu    Not Implemented in MPC601

Divide Double Word Unsigned

Integer Unit

divdu     rD,rA,rB    (OE=0 Rc=0)
divdu.    rD,rA,rB    (OE=0 Rc=1)
divduo    rD,rA,rB    (OE=1 Rc=0)
divduo.   rD,rA,rB    (OE=1 Rc=1)

Bits    0-5   6-10   11-15   16-20   21   22-30   31
Field   31    D      A       B       OE   457     Rc

dividend[0-63] <- rA
divisor[0-63] <- rB
rD <- dividend ÷ divisor

The 64-bit dividend is rA. The 64-bit divisor is rB. The 64-bit quotient of the dividend and
divisor is placed into rD. The remainder is not supplied as a result.

Both the dividend and the divisor are interpreted as unsigned integers. The quotient is the
unique unsigned integer that satisfies

dividend = (quotient * divisor) + r

where 0 ≤ r < divisor.

If an attempt is made to perform the division

<anything> ÷ 0

then the contents of rD are undefined.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
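
A minimal sketch of the unsigned case follows, for comparison with the signed divide. The function name `divdu` and the `None` return for the undefined divide-by-zero case are assumptions for illustration only.

```python
MASK64 = (1 << 64) - 1


def divdu(ra: int, rb: int):
    """Unsigned 64-bit divide; returns None where the text leaves rD undefined."""
    dividend, divisor = ra & MASK64, rb & MASK64
    if divisor == 0:
        return None  # <anything> / 0
    # Floor division on nonnegative operands gives 0 <= r < divisor.
    return dividend // divisor
```

The same register image can give very different quotients under the two interpretations: an all-ones dividend is -1 to a signed divide but 2^64 - 1 here.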

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Exception Register:

Affected: SO, OV (if OE=1)



extsw    Not Implemented in MPC601

Extend Sign Word

extsw    rA,rS    (Rc=0)
extsw.   rA,rS    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    S      A       00000   986     Rc

Bits 16-20 are reserved.

s <- rS[32]
rA[32-63] <- rS[32-63]
rA[0-31] <- (32)s

The contents of rS[32-63] are placed into rA[32-63]. Bit 32 of rS is copied into each bit
of rA[0-31].

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
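
The sign extension can be sketched as follows. This is an illustration under assumptions: the name `extsw` is used for a plain Python function, and the register is modelled as a 64-bit unsigned integer in which PowerPC bit 32 is the bit of weight 2^31.

```python
def extsw(rs: int) -> int:
    """Sign-extend the low-order word of rS to 64 bits."""
    low = rs & 0xFFFF_FFFF          # rS[32-63]
    s = (low >> 31) & 1             # rS[32], the sign bit of the low word
    high = 0xFFFF_FFFF_0000_0000 if s else 0   # (32)s: 32 copies of s
    return high | low
```

Whatever was in the upper half of rS is discarded; only bit 32 decides the upper half of the result.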

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
fcfid    Not Implemented in MPC601

Floating Convert from Integer Double Word

Floating-Point Unit

fcfid    frD,frB    (Rc=0)
fcfid.   frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   63    frD    00000   frB     846     Rc

Bits 11-15 are reserved.

The 64-bit signed fixed-point operand in register frB is converted to an infinitely precise
floating-point integer. If the result of the conversion is already in double-precision range, it
is placed into register frD. Otherwise, the result of the conversion is rounded to double-
precision using the rounding mode specified by FPSCR[RN] and placed into register frD.

FPSCR[FPRF] is set to the class and sign of the result. FPSCR[FR] is set if the result is 
incremented when rounded. FPSCR[FI] is set if the result is inexact. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Exception Register:
Affected: FPRF, FR, FI, FX, XX



fctid    Not Implemented in MPC601

Floating Convert to Integer Double Word

fctid    frD,frB    (Rc=0)
fctid.   frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   63    frD    00000   frB     814     Rc

Bits 11-15 are reserved.

The floating-point operand in frB is converted to a 64-bit signed fixed-point integer, using
the rounding mode specified by FPSCR[RN], and placed into frD.

If the operand in frB is greater than 2^63 - 1, then frD is set to 0x7FFF_FFFF_FFFF_FFFF.
If the operand in frB is less than -2^63, then frD is set to 0x8000_0000_0000_0000.

Except for enabled invalid operation exceptions, FPSCR[FPRF] is undefined. FPSCR[FR]
is set if the result is incremented when rounded. FPSCR[FI] is set if the result is inexact.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Exception Register:

Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI



fctidz    Not Implemented in MPC601

Floating Convert to Integer Double Word with Round to Zero

fctidz    frD,frB    (Rc=0)
fctidz.   frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   63    frD    00000   frB     815     Rc

Bits 11-15 are reserved.

The floating-point operand in frB is converted to a 64-bit signed fixed-point integer, using
the rounding mode round toward zero, and placed into frD.

If the operand in frB is greater than 2^63 - 1, then frD is set to 0x7FFF_FFFF_FFFF_FFFF.
If the operand in frB is less than -2^63, then frD is set to 0x8000_0000_0000_0000.

Except for enabled invalid operation exceptions, FPSCR[FPRF] is undefined. FPSCR[FR] 
is set if the result is incremented when rounded. FPSCR[FI] is set if the result is inexact. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Exception Register:

Affected: FPRF (undefined), FR, FI, FX, XX, VXSNAN, VXCVI
fresx    Not Implemented in MPC601

Floating-Point Reciprocal Estimate Single-Precision

Floating-Point Unit

fres     frD,frB    (Rc=0)
fres.    frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-25   26-30   31
Field   59    frD    00000   frB     00000   24      Rc

Bits 11-15 and 21-25 are reserved.

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction
is provided for emulation purposes.

A single-precision estimate of the reciprocal of the floating-point operand in register frB is
placed into register frD. The estimate placed into register frD is correct to a precision of
one part in 256 of the reciprocal of frB.

Operation with various special values of the operand is summarized below.

Operand   Result    Exception
-∞        -0        None
-0        -∞*       ZX
+0        +∞*       ZX
+∞        +0        None
SNaN      QNaN**    VXSNAN
QNaN      QNaN      None

* No result if FPSCR[ZE]=1.
** No result if FPSCR[VE]=1.

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1 and zero divide exceptions when FPSCR[ZE]=1. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Floating-point Status and Control Register: 

Affected: FX, OX, UX, ZX, VXSNAN, FPRF, FR (undefined), FI (undefined) 
frsqrte    Not Implemented in MPC601

Floating-Point Reciprocal Square Root Estimate

Floating-Point Unit

frsqrte    frD,frB    (Rc=0)
frsqrte.   frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-25   26-30   31
Field   63    frD    00000   frB     00000   26      Rc

Bits 11-15 and 21-25 are reserved.

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction
is provided for emulation purposes.

A double-precision estimate of the reciprocal of the square root of the floating-point
operand in register frB is placed into register frD. The estimate placed into register frD is
correct to a precision of one part in 32 of the reciprocal of the square root of frB.

Operation with various special values of the operand is summarized below.

Operand   Result    Exception
-∞        QNaN**    VXSQRT
<0        QNaN**    VXSQRT
-0        -∞*       ZX
+0        +∞*       ZX
+∞        +0        None
SNaN      QNaN**    VXSNAN
QNaN      QNaN      None

* No result if FPSCR[ZE]=1.
** No result if FPSCR[VE]=1.

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation
exceptions when FPSCR[VE]=1 and zero divide exceptions when FPSCR[ZE]=1.

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Floating-point Status and Control Register: 

Affected: FX, OX, UX, ZX, VXSNAN, FPRF, FR (undefined), FI (undefined) 
fselx    Not Implemented in MPC601

Floating-Point Select

Floating-Point Unit

fsel     frD,frA,frC,frB    (Rc=0)
fsel.    frD,frA,frC,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-25   26-30   31
Field   63    frD    frA     frB     frC     23      Rc

if (frA) ≥ 0.0 then frD <- (frC)
else frD <- (frB)

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction
is provided for emulation purposes.

The floating-point operand in register frA is compared to the value zero. If the operand is 
greater than or equal to zero, register frD is set to the contents of register frC. If the operand 
is less than zero or is a NaN, register frD is set to the contents of register frB. The 
comparison ignores the sign of zero (i.e., regards +0 as equal to -0). 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

Care must be taken in using fsel if IEEE compatibility is required, or if the values being 
tested can be NaNs or infinities. 
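
The selection rule, including the treatment of NaN and signed zero, can be sketched as follows. This is a minimal illustration assuming Python floats stand in for the floating-point registers; the function name `fsel` and its argument order (matching the assembler operand order frA, frC, frB) are choices made here, not part of the manual.

```python
def fsel(fra: float, frc: float, frb: float) -> float:
    """Return frC if frA >= 0.0, otherwise (negative or NaN) return frB."""
    # Any ordered comparison with a NaN is False, so NaN falls through to frB.
    # -0.0 >= 0.0 is True, so the sign of zero is ignored.
    return frc if fra >= 0.0 else frb
```

Because a NaN in frA silently selects frB and raises nothing, code that uses this pattern to replace a compare-and-branch does not behave like an IEEE comparison when NaNs are possible, which is the caution stated above.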



fsqrtx    Not Implemented in MPC601

Floating-Point Square Root [Single-Precision]

Floating-Point Unit

fsqrt     frD,frB    (Rc=0)
fsqrt.    frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-25   26-30   31
Field   63    frD    00000   frB     00000   22      Rc

fsqrts    frD,frB    (Rc=0)
fsqrts.   frD,frB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-25   26-30   31
Field   59    frD    00000   frB     00000   22      Rc

In both forms, bits 11-15 and 21-25 are reserved.

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction
is provided for emulation purposes.

The square root of the floating-point operand in register frB is placed into register frD.

If the most significant bit of the resultant significand is not a one, the result is normalized.
The result is rounded to the target precision under control of the floating-point rounding
control field RN of the FPSCR and placed into register frD.

Operation with various special values of the operand is summarized below.

Operand   Result    Exception
-∞        QNaN*     VXSQRT
<0        QNaN*     VXSQRT
-0        -0        None
+∞        +∞        None
SNaN      QNaN*     VXSNAN
QNaN      QNaN      None

* No result if FPSCR[VE]=1.

FPSCR[FPRF] is set to the class and sign of the result, except for invalid operation 
exceptions when FPSCR[VE]=1. 

Other registers altered: 

• Condition Register (CR1 Field):

Affected: FX, FEX, VX, OX (if Rc=1)

• Floating-point Status and Control Register: 
Affected: FX, XX, VXSQRT, VXSNAN, FPRF, FR, FI 
ld    Not Implemented in MPC601

Load Double Word

Integer Unit

ld    rD,ds(rA)

Bits    0-5   6-10   11-15   16-29   30-31
Field   58    D      A       ds      0

if rA=0 then b <- 0
else b <- rA
EA <- b + EXTS(ds || 0b00)
rD <- MEM(EA, 8)

EA is the sum (rA|0)+(ds||0b00). The double word in storage addressed by EA is loaded
into rD.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
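
The effective-address calculation for this DS-form instruction can be sketched as below. It is an illustration only: the function name `ld_effective_address`, the `gpr` list used as a register file, and the 64-bit wraparound of the sum are assumptions introduced for the example.

```python
MASK64 = (1 << 64) - 1


def ld_effective_address(ra_index: int, gpr: list, ds: int) -> int:
    """Compute EA = (rA|0) + EXTS(ds || 0b00) for a DS-form load."""
    # (rA|0): register number 0 means the literal value 0, not the contents of r0.
    b = 0 if ra_index == 0 else gpr[ra_index]
    # ds is a 14-bit field; appending 0b00 gives a 16-bit, word-aligned displacement.
    disp = (ds & 0x3FFF) << 2
    if disp & 0x8000:          # EXTS: sign-extend the 16-bit displacement
        disp -= 1 << 16
    return (b + disp) & MASK64
```

Since the two low-order displacement bits are always zero, the displacement is a multiple of 4 in the range -32768 to 32764.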

Other registers altered: 
None 



ldarx    Not Implemented in MPC601

Load Double Word and Reserve Indexed

Integer Unit

ldarx    rD,rA,rB

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    D      A       B       84      0

Bit 31 is reserved.

if rA=0 then b <- 0
else b <- rA
EA <- b + rB
RESERVE <- 1
RESERVE_ADDR <- func(EA)
rD <- MEM(EA, 8)

EA is the sum (rA|0)+(rB). The double word in storage addressed by EA is loaded into rD.

This instruction creates a reservation for use by a store double word conditional instruction.
An address computed from the EA is associated with the reservation, and replaces any
address previously associated with the reservation.

EA must be a multiple of 8. If it is not, the system alignment error handler may be invoked
or the results may be boundedly undefined.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



ldu    Not Implemented in MPC601

Load Double Word with Update

Integer Unit

ldu    rD,ds(rA)

Bits    0-5   6-10   11-15   16-29   30-31
Field   58    D      A       ds      1

EA <- rA + EXTS(ds || 0b00)
rD <- MEM(EA, 8)
rA <- EA

EA is the sum (rA)+(ds||0b00). The double word in storage addressed by EA is loaded into
rD.

EA is placed into rA. 

If rA=0 or rA=rD, the instruction form is invalid. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 
ldux    Not Implemented in MPC601

Load Double Word with Update Indexed

Integer Unit

ldux    rD,rA,rB

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    D      A       B       53      0

Bit 31 is reserved.

EA <- rA + rB
rD <- MEM(EA, 8)
rA <- EA

EA is the sum (rA)+(rB). The double word in storage addressed by EA is loaded into rD.

EA is placed into rA.

If rA=0 or rA=rD, the instruction form is invalid.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit
implementation will cause the system illegal instruction error handler to be invoked.



ldx    Not Implemented in MPC601

Load Double Word Indexed

ldx    rD,rA,rB

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    D      A       B       21      0

Bit 31 is reserved.

if rA=0 then b <- 0
else b <- rA
EA <- b + rB
rD <- MEM(EA, 8)

EA is the sum (rA|0)+(rB). The double word in storage addressed by EA is loaded into rD.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit
implementation will cause the system illegal instruction error handler to be invoked.

Other registers altered: 
None 
lwa    Not Implemented in MPC601

Load Word Algebraic

Integer Unit

lwa    rD,ds(rA)

Bits    0-5   6-10   11-15   16-29   30-31
Field   58    D      A       ds      2

if rA=0 then b <- 0
else b <- rA
EA <- b + EXTS(ds || 0b00)
rD <- EXTS(MEM(EA, 4))

EA is the sum (rA|0)+(ds||0b00). The word in storage addressed by EA is loaded into
rD[32-63]. The bits of rD[0-31] are filled with a copy of bit 0 of the loaded word.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



lwaux    Not Implemented in MPC601

Load Word Algebraic with Update Indexed

lwaux    rD,rA,rB

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    D      A       B       373     0

Bit 31 is reserved.

EA <- rA + rB
rD <- EXTS(MEM(EA, 4))
rA <- EA

EA is the sum (rA)+(rB). The word in storage addressed by EA is loaded into rD[32-63].
The bits of rD[0-31] are filled with a copy of bit 0 of the loaded word.

EA is placed into rA.

If rA=0 or rA=rD, the instruction form is invalid.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit
implementation will cause the system illegal instruction error handler to be invoked.

Other registers altered: 
None 



lwax    Not Implemented in MPC601

Load Word Algebraic Indexed

lwax    rD,rA,rB

Bits    0-5   6-10   11-15   16-20   21-30   31
Field   31    D      A       B       341     0

Bit 31 is reserved.

if rA=0 then b <- 0
else b <- rA
EA <- b + rB
rD <- EXTS(MEM(EA, 4))

EA is the sum (rA|0)+(rB). The word in storage addressed by EA is loaded into rD[32-63].
The bits of rD[0-31] are filled with a copy of bit 0 of the loaded word.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit
implementation will cause the system illegal instruction error handler to be invoked.

Other registers altered: 
None 



mftb    Not Implemented in MPC601

Move from Time Base

Integer Unit

mftb    rD,TBR

Bits    0-5   6-10   11-20   21-30   31
Field   31    D      TBR     371     0

Bit 31 is reserved.

N <- TBR[5-9] || TBR[0-4]
if N=268 then
    if (64-bit implementation) then
        rD <- TB
    else
        rD <- TB[32-63]
else if N=269 then
    if (64-bit implementation) then
        rD <- (32)0 || TB[0-31]
    else
        rD <- TB[0-31]

The TBR field denotes either the time base or time base upper, encoded as shown in
Table C-5. The contents of the designated register are copied to rD. When reading Time
Base Upper on a 64-bit implementation, the high-order 32 bits of rD are set to zero.

Table C-5. TBR Encodings for mftb

TBR*                              Register
Decimal   TBR[5-9]   TBR[0-4]     Name       Access
268       01000      01100        TB         User
269       01000      01101        TBU        User

* Note that the order of the two 5-bit halves of the TBR number is
reversed.

If the TBR field contains any value other than one of the values shown in Table C-5, the 
instruction form is invalid. 
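
An emulator has to undo the swapped 5-bit halves before comparing against 268 or 269. The sketch below shows one way to do that; the names `tbr_number` and `mftb`, the `is_64bit` flag, and the use of `ValueError` for an invalid form are all assumptions made for illustration. Here `field` is the raw 10-bit value taken from instruction bits 11-20.

```python
MASK64 = (1 << 64) - 1


def tbr_number(field: int) -> int:
    """Recover the TBR number N = TBR[5-9] || TBR[0-4] from instruction bits 11-20."""
    first = (field >> 5) & 0x1F   # instruction bits 11-15 hold TBR[0-4] (low half of N)
    second = field & 0x1F         # instruction bits 16-20 hold TBR[5-9] (high half of N)
    return (second << 5) | first


def mftb(field: int, tb: int, is_64bit: bool) -> int:
    """Return the value placed into rD for a given 64-bit time base value."""
    n = tbr_number(field)
    if n == 268:                  # TB: whole time base, or its low word on 32-bit
        return tb & MASK64 if is_64bit else tb & 0xFFFF_FFFF
    if n == 269:                  # TBU: upper word, zero-extended on 64-bit
        return (tb >> 32) & 0xFFFF_FFFF
    raise ValueError("invalid instruction form")
```

For TBU the two branches of the pseudocode produce the same numeric value, since the 64-bit case only adds 32 high-order zeros.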

Other registers altered: 
None 



mulhd    Not Implemented in MPC601

Multiply High Double Word

Integer Unit

mulhd    rD,rA,rB    (Rc=0)
mulhd.   rD,rA,rB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21   22-30   31
Field   31    D      A       B       0    73      Rc

Bit 21 is reserved.

prod[0-127] <- rA * rB
rD <- prod[0-63]

The 64-bit multiplicands are rA and rB. The high-order 64 bits of the 128-bit product of
the multiplicands are placed into rD.

Both the multiplicands and the product are interpreted as signed integers. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
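
Python integers are unbounded, so the 128-bit product can be formed directly and the high half extracted. This sketch assumes registers are modelled as 64-bit unsigned integers; the names `mulhd` and `to_signed` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Interpret a 64-bit register image as a two's-complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def mulhd(ra: int, rb: int) -> int:
    """High-order 64 bits (prod[0-63]) of the signed 128-bit product."""
    prod = to_signed(ra) * to_signed(rb)
    # >> on a negative Python int is an arithmetic shift, preserving the sign.
    return (prod >> 64) & MASK64
```

For small operands the result is simply the sign extension of the product: 0 for a nonnegative product and all ones for a negative one.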
Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)



mulhdu    Not Implemented in MPC601

Multiply High Double Word Unsigned

Integer Unit

mulhdu    rD,rA,rB    (Rc=0)
mulhdu.   rD,rA,rB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21   22-30   31
Field   31    D      A       B       0    9       Rc

Bit 21 is reserved.

prod[0-127] <- rA * rB
rD <- prod[0-63]

The 64-bit multiplicands are rA and rB. The high-order 64 bits of the 128-bit product of 
the multiplicands are placed into rD. 

Both the multiplicands and the product are interpreted as unsigned integers. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)



mulld    Not Implemented in MPC601

Multiply Low Double Word

Integer Unit

mulld     rD,rA,rB    (OE=0 Rc=0)
mulld.    rD,rA,rB    (OE=0 Rc=1)
mulldo    rD,rA,rB    (OE=1 Rc=0)
mulldo.   rD,rA,rB    (OE=1 Rc=1)

Bits    0-5   6-10   11-15   16-20   21   22-30   31
Field   31    D      A       B       OE   233     Rc

prod[0-127] <- rA * rB
rD <- prod[64-127]

The 64-bit operands are rA and rB. The low-order 64 bits of the 128-bit product of the
operands are placed into rD.

If OE=1, then SO and OV are set to one if the product cannot be represented in 64 bits.

Both the operands and the product are interpreted as signed integers. However, the result in 
rD is independent of whether the operands are interpreted as signed or unsigned integers. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
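
The low-half multiply and its overflow condition can be sketched as follows. The function name `mulld`, the tuple return value, and the reading of "cannot be represented in 64 bits" as "outside the signed 64-bit range" are assumptions for this example.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Interpret a 64-bit register image as a two's-complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def mulld(ra: int, rb: int):
    """Return (rD, overflow): low 64 bits of the product and the OV condition."""
    prod = to_signed(ra) * to_signed(rb)
    overflow = not (-(1 << 63) <= prod < (1 << 63))
    return prod & MASK64, overflow
```

The low 64 bits come out the same whether the operands are treated as signed or unsigned; only the overflow test depends on the signed interpretation.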

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Exception Register:

Affected: SO, OV (if OE=1)



rldcl    Not Implemented in MPC601

Rotate Left Double Word then Clear Left

Integer Unit

rldcl    rA,rS,rB,MB    (Rc=0)
rldcl.   rA,rS,rB,MB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-26   27-30   31
Field   30    S      A       B       MB      8       Rc

N <- rB[58-63]
r <- ROTL[64](rS, N)
b <- MB[5] || MB[0-4]
m <- MASK(b, 63)
rA <- r & m

The contents of rS are rotated[64] left the number of bits specified by rB[58-63]. A mask 
is generated having 1-bits from bit MB through bit 63 and 0-bits elsewhere. The rotated 
data is ANDed with the generated mask and the result is placed into rA. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
Other registers altered:

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

rldcr    Not Implemented in MPC601

Rotate Left Double Word then Clear Right

Integer Unit

rldcr    rA,rS,rB,ME    (Rc=0)
rldcr.   rA,rS,rB,ME    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-26   27-30   31
Field   30    S      A       B       ME      9       Rc

N <- rB[58-63]
r <- ROTL[64](rS, N)
e <- ME[5] || ME[0-4]
m <- MASK(0, e)
rA <- r & m

The contents of rS are rotated[64] left the number of bits specified by rB[58-63]. A mask
is generated having 1-bits from bit 0 through bit ME and 0-bits elsewhere. The rotated data
is ANDed with the generated mask and the result is placed into rA.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)



rldic    Not Implemented in MPC601

Rotate Left Double Word Immediate then Clear

Integer Unit

rldic    rA,rS,SH,MB    (Rc=0)
rldic.   rA,rS,SH,MB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-26   27-29   30   31
Field   30    S      A       SH      MB      2       SH   Rc

N <- SH[5] || SH[0-4]
r <- ROTL[64](rS, N)
b <- MB[5] || MB[0-4]
m <- MASK(b, ¬N)
rA <- r & m

The contents of rS are rotated[64] left SH bits. A mask is generated having 1-bits from bit
MB through bit 63-SH and 0-bits elsewhere. The rotated data is ANDed with the generated
mask and the result is placed into rA.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 



Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)



rldicl    Not Implemented in MPC601

Rotate Left Double Word Immediate then Clear Left

Integer Unit

rldicl    rA,rS,SH,MB    (Rc=0)
rldicl.   rA,rS,SH,MB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-26   27-29   30   31
Field   30    S      A       SH      MB      0       SH   Rc

N <- SH[5] || SH[0-4]
r <- ROTL[64](rS, N)
b <- MB[5] || MB[0-4]
m <- MASK(b, 63)
rA <- r & m

The contents of rS are rotated[64] left SH bits. A mask is generated having 1-bits from bit
MB through bit 63 and 0-bits elsewhere. The rotated data is ANDed with the generated
mask and the result is placed into rA.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
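
The rotate-and-mask operation can be sketched with two small helpers. This is an illustration under assumptions: `rotl64`, `mask`, and `rldicl` are names chosen here, registers are 64-bit unsigned Python integers, SH and MB are passed as already-assembled 6-bit values, and `mask` wraps around when its first argument exceeds its second.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Rotate a 64-bit value left by n bit positions."""
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64


def mask(mb: int, me: int) -> int:
    """1-bits from bit mb through bit me (bit 0 is the most significant bit)."""
    bits, i = 0, mb
    while True:
        bits |= 1 << (63 - i)
        if i == me:
            break
        i = (i + 1) & 63   # assumption: the mask wraps around when mb > me
    return bits


def rldicl(rs: int, sh: int, mb: int) -> int:
    """Rotate rS left by sh bits, then clear bits 0 through mb-1."""
    return rotl64(rs & MASK64, sh) & mask(mb, 63)
```

With SH = 64 - n and MB = n this produces a logical right shift by n bits, and with SH = 0 it simply clears the high-order MB bits.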

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)
rldicr    Not Implemented in MPC601

Rotate Left Double Word Immediate then Clear Right

Integer Unit

rldicr    rA,rS,SH,ME    (Rc=0)
rldicr.   rA,rS,SH,ME    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-26   27-29   30   31
Field   30    S      A       SH      ME      1       SH   Rc

N <- SH[5] || SH[0-4]
r <- ROTL[64](rS, N)
e <- ME[5] || ME[0-4]
m <- MASK(0, e)
rA <- r & m

The contents of rS are rotated[64] left SH bits. A mask is generated having 1-bits from bit 0
through bit ME and 0-bits elsewhere. The rotated data is ANDed with the generated mask
and the result is placed into rA.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)



rldimi    Not Implemented in MPC601

Rotate Left Double Word Immediate then Mask Insert

Integer Unit

rldimi    rA,rS,SH,MB    (Rc=0)
rldimi.   rA,rS,SH,MB    (Rc=1)

Bits    0-5   6-10   11-15   16-20   21-26   27-29   30   31
Field   30    S      A       SH      MB      3       SH   Rc

N <- SH[5] || SH[0-4]
r <- ROTL[64](rS, N)
b <- MB[5] || MB[0-4]
m <- MASK(b, ¬N)
rA <- (r & m) | (rA & ¬m)

The contents of rS are rotated[64] left SH bits. A mask is generated having 1-bits from bit
MB through bit 63-SH and 0-bits elsewhere. The rotated data is inserted into rA under
control of the generated mask.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
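
The insert-under-mask step differs from the clear forms in that bits of rA outside the mask are kept. The sketch below illustrates it; the names `rotl64`, `mask`, and `rldimi` are hypothetical, registers are modelled as 64-bit unsigned integers, SH and MB are passed as 6-bit values, and the mask is assumed to wrap when MB exceeds 63-SH.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Rotate a 64-bit value left by n bit positions."""
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64


def mask(mb: int, me: int) -> int:
    """1-bits from bit mb through bit me (bit 0 is the most significant bit)."""
    bits, i = 0, mb
    while True:
        bits |= 1 << (63 - i)
        if i == me:
            break
        i = (i + 1) & 63   # assumption: the mask wraps around when mb > me
    return bits


def rldimi(ra: int, rs: int, sh: int, mb: int) -> int:
    """Insert the rotated rS into rA under the mask MASK(mb, 63 - sh)."""
    r = rotl64(rs & MASK64, sh)
    m = mask(mb, 63 - sh)            # the 6-bit complement of sh equals 63 - sh
    return (r & m) | (ra & ~m & MASK64)
```

For example, with SH=8 and MB=48 the low byte of rS lands in bits 48-55 of rA and every other bit of rA is preserved.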



Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)



slbia    Not Implemented in MPC601

SLB Invalidate All

slbia

Bits    0-5   6-10    11-15   16-20   21-30   31
Field   31    00000   00000   00000   498     0

Bits 6-20 and 31 are reserved.

All SLB entries <- invalid

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction
is provided for emulation purposes.

The SLB is invalidated regardless of the settings of MSR[IR] and MSR[DR]. 

This instruction is supervisor-level. 

This instruction is optional in PowerPC architecture. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause an illegal instruction type program interrupt. 

It is not necessary that the ASR point to a valid segment table when issuing slbia. 

Other registers altered: 
None 
slbie                                   Not Implemented in MPC601

SLB Invalidate Entry

slbie       rB

Bits:    0-5    6-10     11-15    16-20    21-30    31
Field:   31     00000    00000    B        434      Reserved

Reserved: bits 6-15 and bit 31

EA <- (rB)

if SLB entry exists for EA, then
    SLB entry <- invalid

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction 
will invoke the illegal instruction handler. A description of the operation of this instruction 
is provided for emulation purposes. 

EA is the contents of rB. If the segment lookaside buffer (SLB) contains an entry 
corresponding to EA, that entry is made invalid (i.e., removed from the SLB). 

The SLB search is done regardless of the settings of MSR[IR] and MSR[DR]. 

Block address translation for EA, if any, is ignored. 

This instruction is supervisor-level. 

This instruction is optional in PowerPC architecture. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause an illegal instruction type program interrupt. 

Other registers altered: 

None 

It is not necessary that the ASR point to a valid segment table when issuing slbie. 
slbiex                                  Not Implemented in MPC601

SLB Invalidate Entry by Index

slbiex      rB

Bits:    0-5    6-10     11-15    16-20    21-30    31
Field:   31     00000    00000    B        466      Reserved

Reserved: bits 6-15 and bit 31

N <- (rB)

SLB entry N <- invalid

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction 
is provided for emulation purposes. 

The SLB entry is invalidated regardless of the settings of MSR[IR] and MSR[DR]. 

This instruction is supervisor-level. 

This instruction is optional in PowerPC architecture. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause an illegal instruction type program interrupt. 

Other registers altered: 
None 



sld                                     Not Implemented in MPC601

Shift Left Double Word

sld         rA,rS,rB        (Rc=0)
sld.        rA,rS,rB        (Rc=1)

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     S       A        B        27       Rc

N <- rB[58-63]
r <- ROTL[64](rS, N)
if rB[57]=0 then
    m <- MASK(0, 63-N)
else m <- (64)0
rA <- r & m
The contents of rS are shifted left the number of bits specified by rB[57-63]. Bits shifted
out of position are lost. Zeros are supplied to the vacated positions on the right. The result 
is placed into rA. Shift amounts from 64 to 127 give a zero result. 
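
A minimal Python sketch of the shift-left behavior described above may help when writing an emulation routine. It assumes registers are modeled as unsigned 64-bit integers; the function name sld is used here only for illustration.

```python
MASK64 = (1 << 64) - 1

def sld(rs, rb):
    """Model of sld: only the low-order 7 bits of rB (rB[57-63]) select the shift amount."""
    n = rb & 0x7F
    if n >= 64:              # rB[57] = 1: shift amounts 64-127 give a zero result
        return 0
    return (rs << n) & MASK64
```

Note that bits of rB above the low-order seven are ignored, so in this model a value of 128 in rB behaves as a shift by zero.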

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 



Other registers altered: 

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)



srad                                    Not Implemented in MPC601

Shift Right Algebraic Double Word

srad        rA,rS,rB        (Rc=0)
srad.       rA,rS,rB        (Rc=1)

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     S       A        B        794      Rc

N <- rB[58-63]
r <- ROTL[64](rS, 64-N)
if rB[57]=0 then
    m <- MASK(N, 63)
else m <- (64)0
s <- rS[0]
rA <- (r & m) | (((64)s) & ¬m)
CA <- s & ((r & ¬m) ≠ 0)

The contents of rS are shifted right the number of bits specified by rB[57-63]. Bits shifted 
out of position 63 are lost. Bit 0 of rS is replicated to fill the vacated positions on the left.
The result is placed into rA. CA is set to 1 if rS is negative and any 1-bits are shifted out 
of position 63; otherwise CA is set to 0. A shift amount of zero causes rA to be set equal to 
rS, and CA to be set to 0. Shift amounts from 64 to 127 give a result of 64 sign bits in rA, 
and cause CA to receive the sign bit of rS. 
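
The result and carry rules above can be sketched as follows. This is a hedged Python model for illustration only: the function name srad and the (result, CA) return tuple are assumptions, and registers are modeled as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1

def srad(rs, rb):
    """Model of srad: returns (rA, CA) for a 64-bit algebraic right shift."""
    n = rb & 0x7F                          # shift amount from rB[57-63]
    sign = rs >> 63                        # bit 0 of rS
    signed_value = rs - (1 << 64) if sign else rs
    if n >= 64:
        # 64 sign bits in rA; CA receives the sign bit of rS
        return (MASK64 if sign else 0), sign
    result = (signed_value >> n) & MASK64
    shifted_out = rs & ((1 << n) - 1)      # bits lost out of position 63
    ca = 1 if (sign and shifted_out) else 0
    return result, ca
```

In this sketch, shifting -11 right by one place yields -6 with CA set, because a 1-bit was shifted out of a negative value; shifting -12 right by two places yields -3 with CA clear.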

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 
Other registers altered:

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO (if Rc=1)

• Exception Register:
Affected: CA



sradi                                   Not Implemented in MPC601

Shift Right Algebraic Double Word Immediate

sradi       rA,rS,SH        (Rc=0)
sradi.      rA,rS,SH        (Rc=1)

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-29    30    31
Field:   31     S       A        SH       413      SH    Rc

N <- SH[5] || SH[0-4]
r <- ROTL[64](rS, 64-N)
m <- MASK(N, 63)
s <- rS[0]
rA <- (r & m) | (((64)s) & ¬m)
CA <- s & ((r & ¬m) ≠ 0)

The contents of rS are shifted right SH bits. Bits shifted out of position 63 are lost. Bit 0 of
rS is replicated to fill the vacated positions on the left. The result is placed into rA. CA is
set to 1 if rS is negative and any 1-bits are shifted out of position 63; otherwise CA is set 
to 0. A shift amount of zero causes rA to be set equal to rS, and CA to be set to 0. 

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)

• Exception Register: 
Affected: CA 
srd                                     Not Implemented in MPC601

Shift Right Double Word

srd         rA,rS,rB        (Rc=0)
srd.        rA,rS,rB        (Rc=1)

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     S       A        B        539      Rc

N <- rB[58-63]
r <- ROTL[64](rS, 64-N)
if rB[57]=0 then
    m <- MASK(N, 63)
else m <- (64)0
rA <- r & m

The contents of rS are shifted right the number of bits specified by rB[57-63]. Bits shifted
out of position 63 are lost. Zeros are supplied to the vacated positions on the left. The result 
is placed into rA. Shift amounts from 64 to 127 give a zero result. 
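
The logical right shift described above can be modeled in a few lines of Python. This is a sketch under the assumption that registers are unsigned 64-bit integers; the name srd is illustrative.

```python
MASK64 = (1 << 64) - 1

def srd(rs, rb):
    """Model of srd: logical right shift by the amount in rB[57-63]."""
    n = rb & 0x7F
    if n >= 64:              # shift amounts 64-127 give a zero result
        return 0
    return (rs & MASK64) >> n
```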

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR0 Field):

Affected: LT, GT, EQ, SO (if Rc=1)



std                                     Not Implemented in MPC601

Store Double Word

std         rS,ds(rA)

Integer Unit

Bits:    0-5    6-10    11-15    16-29    30-31
Field:   62     S       A        ds       0 0

if rA=0 then b <- 0
else b <- rA

EA <- b + EXTS(ds || 0b00)
(MEM(EA, 8)) <- rS

EA is the sum (rA|0)+(ds||0b00). Register rS is stored into the double word in storage
addressed by EA.
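
The effective address calculation above can be illustrated with a short Python sketch. The function and parameter names are hypothetical; ds is taken to be the 14-bit instruction field, and the register number is passed separately from the register contents so that the (rA|0) rule can be applied.

```python
MASK64 = (1 << 64) - 1

def std_effective_address(ra_number, ra_value, ds):
    """EA = (rA|0) + EXTS(ds || 0b00) for a 14-bit ds field."""
    disp = (ds & 0x3FFF) << 2        # append 0b00 to form a 16-bit displacement
    if disp & 0x8000:                # sign-extend from 16 bits
        disp -= 1 << 16
    base = 0 if ra_number == 0 else ra_value   # rA=0 means the value 0, not GPR0
    return (base + disp) & MASK64
```

Because the two low-order displacement bits are always zero, the displacement is always a multiple of 4.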
This instruction is defined only for 64-bit implementations. Using it on a 32-bit
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



stdcx.                                  Not Implemented in MPC601

Store Double Word Conditional Indexed

stdcx.      rS,rA,rB

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     S       A        B        214      1

if rA=0 then b <- 0
else b <- rA

EA <- b + rB
if RESERVE then
    (MEM(EA, 8)) <- rS
    RESERVE <- 0
    CR0 <- 0b00 || 0b1 || XER[SO]
else
    CR0 <- 0b00 || 0b0 || XER[SO]

EA is the sum (rA|0)+(rB).

If a reservation exists, rS is stored into the double word in storage addressed by EA and the
reservation is cleared.

If a reservation does not exist, the instruction completes without altering storage.

CR0 Field is set to reflect whether the store operation was performed (i.e., whether a
reservation existed when the stdcx. instruction commenced execution), as follows:

CR0[LT GT EQ SO] <- 0b00 || store_performed || XER[SO]

EA must be a multiple of 8. If it is not, the system alignment error handler may be invoked 
or the results may be boundedly undefined. 
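
The conditional-store behavior above can be sketched with a small Python model. Everything here is an assumption made for illustration: the class ReservationModel, the set_reservation helper (standing in for whatever instruction establishes the reservation), the dictionary used as storage, and the choice to raise an error for a misaligned EA, which the text leaves to the alignment handler or to boundedly undefined results.

```python
class ReservationModel:
    """Toy model of a single processor's reservation and storage."""

    def __init__(self):
        self.memory = {}        # double words keyed by effective address
        self.reserve = False    # the RESERVE bit of the pseudocode

    def set_reservation(self):
        """Hypothetical stand-in for the load that establishes a reservation."""
        self.reserve = True

    def stdcx(self, ea, rs, xer_so=0):
        """Return CR0 as a 4-bit value LT||GT||EQ||SO = 0b00 || store_performed || XER[SO]."""
        if ea % 8 != 0:
            raise ValueError("EA must be a multiple of 8")  # modeling choice, see lead-in
        store_performed = 1 if self.reserve else 0
        if store_performed:
            self.memory[ea] = rs
            self.reserve = False    # the reservation is cleared by a successful store
        return (store_performed << 1) | (xer_so & 1)
```

A caller can test the EQ bit of the returned value (mask 0b0010) to learn whether the store was performed.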

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 

• Condition Register (CR0 Field):
Affected: LT, GT, EQ, SO 
stdu                                    Not Implemented in MPC601

Store Double Word with Update

stdu        rS,ds(rA)

Integer Unit

Bits:    0-5    6-10    11-15    16-29    30-31
Field:   62     S       A        ds       0 1

EA <- rA + EXTS(ds || 0b00)
(MEM(EA, 8)) <- rS
rA <- EA

EA is the sum (rA)+(ds||0b00). Register rS is stored into the double word in storage
addressed by EA.

EA is placed into rA.

If rA=0, the instruction form is invalid.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



stdux                                   Not Implemented in MPC601

Store Double Word with Update Indexed

stdux       rS,rA,rB

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     S       A        B        181      Reserved

Reserved: bit 31

EA <- rA + rB
MEM(EA, 8) <- rS
rA <- EA

EA is the sum (rA)+(rB). Register rS is stored into the double word in storage addressed
by EA.

EA is placed into rA. 

If rA=0, the instruction form is invalid. 
This instruction is defined only for 64-bit implementations. Using it on a 32-bit
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



stdx                                    Not Implemented in MPC601

Store Double Word Indexed

stdx        rS,rA,rB

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     S       A        B        149      Reserved

Reserved: bit 31

if rA=0 then b <- 0
else b <- rA

EA <- b + rB
(MEM(EA, 8)) <- rS

EA is the sum (rA|0)+(rB). Register rS is stored into the double word in storage addressed
by EA.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



stfiwx                                  Not Implemented in MPC601

Store Floating-Point as Integer Word

stfiwx      frS,rA,rB

Floating-Point Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     frS     A        B        983      Reserved

Reserved: bit 31

if rA=0 then b <- 0
else b <- rA

EA <- b + rB

MEM(EA, 4) <- frS[32-63]

EA is the sum (rA|0)+(rB).

The contents of the low-order 32 bits of register frS are stored, without conversion, into
the word in storage addressed by EA.

Other registers altered: 
None 



td                                      Not Implemented in MPC601

Trap Double Word

td          TO,rA,rB

Integer Unit

Bits:    0-5    6-10    11-15    16-20    21-30    31
Field:   31     TO      A        B        68       Reserved

Reserved: bit 31

a <- rA
b <- rB
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a u< b) & TO[3] then TRAP
if (a u> b) & TO[4] then TRAP

The contents of rA are compared with the contents of rB. If any bit in the TO field is set to
1 and its corresponding condition is met by the result of the comparison, then the system 
trap handler is invoked. 
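
The trap condition test above can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, TO is passed as a 5-bit integer with TO[0] as its most significant bit, and the register values are passed as unsigned 64-bit integers.

```python
def to_signed64(x):
    """Reinterpret an unsigned 64-bit value as a signed two's-complement value."""
    return x - (1 << 64) if x >> 63 else x

def td_traps(to, ra, rb):
    """Return True if td with the given TO field would invoke the trap handler."""
    a, b = to_signed64(ra), to_signed64(rb)
    conditions = [
        a < b,       # TO[0]: signed less than
        a > b,       # TO[1]: signed greater than
        ra == rb,    # TO[2]: equal
        ra < rb,     # TO[3]: unsigned less than
        ra > rb,     # TO[4]: unsigned greater than
    ]
    return any(((to >> (4 - i)) & 1) and cond for i, cond in enumerate(conditions))
```

The same value can satisfy the signed and unsigned tests differently: with rA holding all 1-bits and rB holding zero, rA is less than rB as a signed value but greater as an unsigned value.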

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 
tdi                                     Not Implemented in MPC601

Trap Double Word Immediate

tdi         TO,rA,SIMM

Integer Unit

Bits:    0-5    6-10    11-15    16-31
Field:   02     TO      A        SIMM

a <- rA
if (a < EXTS(SIMM)) & TO[0] then TRAP
if (a > EXTS(SIMM)) & TO[1] then TRAP
if (a = EXTS(SIMM)) & TO[2] then TRAP
if (a u< EXTS(SIMM)) & TO[3] then TRAP
if (a u> EXTS(SIMM)) & TO[4] then TRAP

The contents of rA are compared with the sign-extended SIMM field. If any bit in the TO
field is set to 1 and its corresponding condition is met by the result of the comparison, then
the system trap handler is invoked.

This instruction is defined only for 64-bit implementations. Using it on a 32-bit 
implementation will cause the system illegal instruction error handler to be invoked. 

Other registers altered: 
None 



tlbia                                   Not Implemented in MPC601

Translation Lookaside Buffer Invalidate All

tlbia

Integer Unit

Bits:    0-5    6-10     11-15    16-20    21-30    31
Field:   31     00000    00000    00000    NA       Reserved

Reserved: bits 6-20 and bit 31

All TLB entries <- invalid

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction 
is provided for emulation purposes. 

The entire TLB is invalidated (i.e., all entries are removed). 

The TLB is invalidated regardless of the settings of MSR[IT] and MSR[DT]. 

This is a supervisor-level instruction. 
This instruction is optional in PowerPC architecture.

Other registers altered: 

None 

It is not necessary that the ASR point to a valid segment table when issuing tlbia.



tlbiex                                  Not Implemented in MPC601

Translation Lookaside Buffer Invalidate Entry by Index

tlbiex      rB

Integer Unit

Bits:    0-5    6-10     11-15    16-20    21-30    31
Field:   31     00000    00000    B        NA       Reserved

Reserved: bits 6-15 and bit 31

N <- (rB)

TLB entry N <- invalid

This PowerPC instruction is not implemented by the MPC601. Execution of this instruction
will invoke the illegal instruction handler. A description of the operation of this instruction 
is provided for emulation purposes. 

Let N be the contents of rB. The Nth TLB entry is made invalid (i.e., removed from the
TLB). The TLB entry is invalidated regardless of the settings of MSR[IT] and MSR[DT].
If the Nth TLB entry does not exist, the results are implementation defined.

This instruction is supervisor-level. 

This instruction is optional in PowerPC architecture.

Other registers altered: 

None 

How software knows which TLB entry number is associated with which page table entry, 
or even how many TLB entries there are, is not specified in the architecture. This may differ 
among PowerPC processors.

It is not necessary that the ASR point to a valid segment table when issuing tlbiex.
tlbsync                                 Not Implemented in MPC601

TLB Synchronize

tlbsync

Bits:    0-5    6-10     11-15    16-20    21-30    31
Field:   31     00000    00000    00000    566      Reserved

Reserved: bits 6-20 and bit 31



The tlbsync instruction waits until all previous tlbie, tlbiex, and tlbia instructions executed
by the processor executing this instruction have been received and completed by all other 
processors. 

This instruction is supervisor-level. 

This instruction is optional in PowerPC architecture, but it must be implemented if any of 
the following are true: 

• A TLB invalidation instruction that broadcasts is implemented. 

• The eciwx or ecowx instructions are implemented. 

Other registers altered: 
None 
Appendix D

Classes of Instructions 



This appendix describes how the classes of PowerPC instructions are defined. The three
classifications are as follows: 

• Defined 

• Illegal 

• Reserved 

Note that while the definitions of these terms are consistent among the PowerPC 
processors, the assignment of these classifications is not. For example, an instruction that 
is specific to 64-bit implementations is considered defined for 64-bit implementations, but 
illegal for 32-bit implementations such as the MPC601.

D.1 Classes of Instructions

The MPC601 is a 32-bit implementation of the PowerPC architecture with differences and 
redefinitions noted throughout this document. Differences stem largely from the different
address bus sizes and compliance with POWER architecture. 

All MPC601 instructions belong to one of the following three classes: 

• Defined 

• Illegal 

• Reserved 

The class is determined by examining the opcode and the extended opcode, if any. If the 
opcode, or combination of opcode and extended opcode, is not that of a defined instruction 
nor of a reserved instruction, the instruction is illegal. 

In future versions of the PowerPC architecture, instructions that are now illegal may 
become defined (by being added to the architecture) or reserved (by being assigned to one 
of the special purposes). Likewise, reserved instructions may become defined. 

D.1.1 Defined Instruction Class

Defined instructions are guaranteed to be supported in all PowerPC implementations, 
except as stated in the instruction descriptions in Chapter 10, "Instruction Set." The 
MPC601 provides hardware support for most of the instructions defined for 32-bit
implementations; it does not provide direct hardware support for the instructions listed in
Appendix C, "PowerPC Instructions Not Implemented in MPC601."

The MPC601 invokes the system illegal instruction error handler (part of the program 
exception) when the unimplemented PowerPC instructions are encountered so they may be 
emulated in software, as required. 

A defined instruction can have invalid forms, as described in Section D.1.1.1, "Invalid
Instruction Forms." 

D.1.1.1 Invalid Instruction Forms 

An instruction form is invalid if one or more operands, excluding opcodes, are coded 
incorrectly. Attempting to execute an invalid form of an instruction either invokes the 
system illegal instruction error handler (a program exception) or yields undefined results. 
See Chapter 10, "Instruction Set," for individual instruction descriptions. 

Invalid forms result, for example, when a bit or operand is coded incorrectly or when a
reserved bit is shown as "0" but is coded as a "1". The following instructions have invalid
forms identified in their individual instruction descriptions:

• Branch conditional instructions

• Load/store with update instructions

• Load multiple instructions

• Load string instructions

• Move to/from special purpose register (mtspr, mfspr)

• Load/store floating-point with update instructions

In some cases, an invalid form of a PowerPC instruction is not an invalid form for the 
corresponding POWER instruction. As a result, to maintain compatibility with POWER 
applications, the MPC601 often handles PowerPC invalid forms as described in the 
POWER architecture. In other cases, the MPC601 handles the invalid form in the manner 
that is most convenient for that particular case. Each of the PowerPC invalid forms is
addressed in this document, and a description of how the MPC601 handles each case is
provided.

D.1.2 Illegal Instruction Class

Illegal instructions can be grouped into the following categories: 

• Instructions that are not implemented in the PowerPC architecture. These opcodes 
are available for future extensions of the PowerPC architecture; that is, future 
versions of the PowerPC architecture may define any of these instructions to 
perform new functions. The following primary opcodes are illegal: 

1, 4, 5, 6, 56, 57, 60, 61

• Instructions that are implemented in the PowerPC architecture but are not
implemented in a specific PowerPC implementation (for example, instructions that
can be executed on 64-bit PowerPC processors are considered illegal for 32-bit
processors).

The following opcodes are defined for 64-bit implementations only and are illegal
on the MPC601:

2, 30, 58, 62

• The following primary opcodes have unused extended opcodes. Their unused
extended opcodes can be determined from information in Section A.2, "PowerPC 
Instruction List Sorted by Opcode," and Section D.1.3, "Reserved Instructions." 
Notice that extended opcodes for instructions that are defined only for 64-bit 
implementations are illegal in 32-bit implementations. All unused extended 
opcodes are illegal. 

19, 31, 59, 63 (opcodes 30 and 62 are illegal for all 32-bit implementations, but as 
64-bit opcodes have some unused extended opcodes). 

An attempt to execute an illegal instruction invokes the illegal instruction error handler (a 
program exception) but has no other effect. See Section 5.4.7, "Program Exception 
(x'00700')," for additional information about illegal and invalid instruction exceptions.

Note that an instruction consisting entirely of binary zeros is guaranteed to be an illegal 
instruction. This increases the probability that an attempt to execute data or uninitialized 
memory invokes the system illegal instruction error handler (a program exception). Note 
that if only the primary opcode consists of all zeros, the instruction is considered a reserved 
instruction, as described in Section D.1.3, "Reserved Instructions." 

With the exception of the instruction consisting entirely of binary zeros, the illegal
instructions are available for further additions to the PowerPC architecture. 
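
As a rough illustration of the rules in this section, the following Python sketch performs a first-pass classification of a 32-bit instruction word for the MPC601 using only the primary opcode (bits 0-5). The function name and the returned labels are hypothetical, and the sketch deliberately does not decode extended opcodes or identify POWER-only instructions; for those it only reports that the opcode tables must be consulted.

```python
ILLEGAL_UNASSIGNED = {1, 4, 5, 6, 56, 57, 60, 61}   # not implemented in the architecture
ILLEGAL_64BIT_ONLY = {2, 30, 58, 62}                # defined for 64-bit implementations only
HAS_EXTENDED_OPCODES = {19, 31, 59, 63}             # class depends on the extended opcode

def classify_by_primary_opcode(word):
    """First-pass class of a 32-bit instruction word on a 32-bit implementation."""
    if word == 0:
        return "illegal"                 # all binary zeros is guaranteed illegal
    primary = (word >> 26) & 0x3F        # bits 0-5, with bit 0 as the most significant
    if primary == 0:
        return "reserved"                # primary opcode 0, but not all zeros
    if primary in ILLEGAL_UNASSIGNED or primary in ILLEGAL_64BIT_ONLY:
        return "illegal"
    if primary in HAS_EXTENDED_OPCODES:
        return "check extended opcode"
    return "not illegal by primary opcode"
```

For instance, a word whose primary opcode is 62 (the std and stdu encodings) is reported as illegal, which matches the behavior described for 64-bit-only instructions on the MPC601.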

D.1.3 Reserved Instructions 

Reserved instructions are allocated to specific purposes outside the scope of the PowerPC 
architecture. An attempt to execute a reserved instruction either causes a program exception 
or yields undefined results. 

An attempt to execute a reserved instruction invokes the illegal instruction error handler (a 
program exception); however, the MPC601 executes many POWER architecture 
instructions that otherwise are not part of the PowerPC architecture. See Section 5.4.7, 
"Program Exception (x'00700')," for additional information about illegal and invalid 
instruction exceptions. 

The instructions in this class are allocated to specific purposes that are outside the scope of 
the PowerPC user instruction set architecture, PowerPC virtual environment architecture, 
and PowerPC operating environment architecture. 
The following types of instructions are included in this class:

1. Instructions for the POWER architecture that have not been included in the
PowerPC user instruction set architecture 

2. Implementation-specific instructions used to conform to the PowerPC 
architecture specifications 

3. The instruction with primary opcode 0, when the instruction does not consist 
entirely of binary zeros 

4. Any other implementation-specific instructions that are not defined in the PowerPC 
user instruction set architecture, PowerPC virtual environment architecture, or the 
PowerPC operating environment architecture 
Appendix E
Multiple-Precision Shifts 

This appendix gives examples of how multiple-precision shifts can be programmed. A
multiple-precision shift is initially defined to be a shift of an n-word quantity, where n > 1.
The quantity to be shifted is contained in n registers. The shift amount is specified either by
an immediate value in the instruction or by bits 27-31 of a register.

The examples shown below distinguish between the cases n = 2 and n > 2. If n = 2, the shift
amount may be in the range 0-63, which is the maximum range supported by the shift
instructions used. However, if n > 2, the shift amount must be in the range 0-31 for the
examples to yield the desired result. The specific instance shown for n > 2 is n = 3;
extending those instruction sequences to larger n is straightforward, as is reducing them to
the case n = 2 when the more stringent restriction on shift amount is met. For shifts with
immediate shift amounts only the case n = 3 is shown, because the more stringent
restriction on shift amount is always met.

In the examples it is assumed that GPRs 2 and 3 (and 4) contain the quantity to be shifted, 
and that the result is to be placed into the same registers. In all cases, for both input and 
result, the lowest-numbered register contains the highest-order part of the data and the
highest-numbered register contains the lowest-order part. For non-immediate shifts, the shift
amount is assumed to be in bits 27-31 (32-bit mode) of GPR6. For immediate shifts, the
shift amount is assumed to be greater than 0. GPRs 0 and 31 are used as scratch registers. For
n > 2, the number of instructions required is 2n-1 (immediate shifts) or 3n-1
(non-immediate shifts).

E.1 Multiple-Precision Shift Examples 

The examples shown here are for 32-bit mode, but they work both in 32-bit mode of a 64- 
bit implementation and in a 32-bit implementation. They perform the shift in units of 
words. If the ability to run in 32-bit implementations is not required, in a 64-bit 
implementation better performance can be obtained in 32-bit mode than that of the 
examples shown above, by using all 64 bits of GPRs 2 and 3 (and 4) to contain the quantity 
to be shifted, and placing the result into all 64 bits of the same registers. 

Let n be the number of words to be shifted. 






Shift Left Immediate, n = 3 (Shift Amount < 32)

rlwinm    r2,r2,SH,0,31-SH
rlwimi    r2,r3,SH,32-SH,31
rlwinm    r3,r3,SH,0,31-SH
rlwimi    r3,r4,SH,32-SH,31
rlwinm    r4,r4,SH,0,31-SH

Shift Left, n = 2 (Shift Amount < 64)

subfic    r31,r6,32
slw       r2,r2,r6
srw       r0,r3,r31
or        r2,r2,r0
addic     r31,r6,-32
slw       r0,r3,r31
or        r2,r2,r0
slw       r3,r3,r6
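
The two-word shift left sequence can be checked with a small Python simulation. This is a sketch under stated assumptions rather than a description of hardware: the helper functions slw and srw model only the property the sequence relies on (the low-order 6 bits of the shift register are used, and amounts of 32-63 give zero), the addic step is modeled with an immediate of -32, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def slw(rs, rb):
    """32-bit shift left word; shift amounts 32-63 give zero."""
    n = rb & 0x3F
    return (rs << n) & MASK32 if n < 32 else 0

def srw(rs, rb):
    """32-bit shift right word; shift amounts 32-63 give zero."""
    n = rb & 0x3F
    return (rs >> n) if n < 32 else 0

def shift_left_2words(hi, lo, amount):
    """Simulate the n = 2 sequence; r2 holds the high word, r3 the low word."""
    r2, r3, r6 = hi, lo, amount
    r31 = (32 - r6) & MASK32      # subfic r31,r6,32
    r2 = slw(r2, r6)              # slw    r2,r2,r6
    r0 = srw(r3, r31)             # srw    r0,r3,r31
    r2 |= r0                      # or     r2,r2,r0
    r31 = (r6 - 32) & MASK32      # addic  r31,r6,-32
    r0 = slw(r3, r31)             # slw    r0,r3,r31
    r2 |= r0                      # or     r2,r2,r0
    r3 = slw(r3, r6)              # slw    r3,r3,r6
    return r2, r3
```

For shift amounts below 32 the second slw contributes zero, and for amounts of 32 or more the srw contributes zero, so exactly one of the two partial results supplies the bits that cross from the low word into the high word (at exactly 32 both supply the same value).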

Shift Left, n = 3 (Shift Amount < 32)

subfic    r31,r6,32
slw       r2,r2,r6
srw       r0,r3,r31
or        r2,r2,r0
slw       r3,r3,r6
srw       r0,r4,r31
or        r3,r3,r0
slw       r4,r4,r6

Shift Right Immediate, n = 3 (Shift Amount < 32)

rlwinm    r4,r4,32-SH,SH,31
rlwimi    r4,r3,32-SH,0,SH-1
rlwinm    r3,r3,32-SH,SH,31
rlwimi    r3,r2,32-SH,0,SH-1
rlwinm    r2,r2,32-SH,SH,31

Shift Right, n = 2 (Shift Amount < 64)

subfic    r31,r6,32
srw       r3,r3,r6
slw       r0,r2,r31
or        r3,r3,r0
addic     r31,r6,-32
srw       r0,r2,r31
or        r3,r3,r0
srw       r2,r2,r6



Shift Right, n = 3 (Shift Amount < 32)

subfic    r31,r6,32
srw       r4,r4,r6
slw       r0,r3,r31
or        r4,r4,r0
srw       r3,r3,r6
slw       r0,r2,r31
or        r3,r3,r0
srw       r2,r2,r6
Shift Right Algebraic Immediate, n = 3 (Shift Amount < 32)

rlwinm    r4,r4,32-SH,SH,31
rlwimi    r4,r3,32-SH,0,SH-1
rlwinm    r3,r3,32-SH,SH,31
rlwimi    r3,r2,32-SH,0,SH-1
srawi     r2,r2,SH

Shift Right Algebraic, n = 2 (Shift Amount < 64)

subfic    r31,r6,32
srw       r3,r3,r6
slw       r0,r2,r31
or        r3,r3,r0
addic.    r31,r6,-32
sraw      r0,r2,r31
ble       $+8
ori       r3,r0,0
sraw      r2,r2,r6
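
The algebraic two-word sequence differs from the logical one in that it uses a conditional branch to decide whether the low word should be taken from the sign-propagating shift. A hedged Python simulation follows; the helpers model only the shift properties the sequence depends on (low-order 6 bits of the shift register, amounts of 32-63 giving zero or 32 sign bits), and every name is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Reinterpret an unsigned 32-bit value as signed."""
    return x - (1 << 32) if x & 0x80000000 else x

def slw(rs, rb):
    n = rb & 0x3F
    return (rs << n) & MASK32 if n < 32 else 0

def srw(rs, rb):
    n = rb & 0x3F
    return (rs >> n) if n < 32 else 0

def sraw(rs, rb):
    """Algebraic shift right word; amounts 32-63 give 32 sign bits."""
    n = rb & 0x3F
    return (to_signed32(rs) >> min(n, 31)) & MASK32

def shift_right_algebraic_2words(hi, lo, amount):
    """Simulate the n = 2 sequence; r2 holds the high word, r3 the low word."""
    r2, r3, r6 = hi, lo, amount
    r31 = (32 - r6) & MASK32      # subfic r31,r6,32
    r3 = srw(r3, r6)              # srw    r3,r3,r6
    r0 = slw(r2, r31)             # slw    r0,r2,r31
    r3 |= r0                      # or     r3,r3,r0
    r31 = (r6 - 32) & MASK32      # addic. r31,r6,-32 (also sets CR0)
    r0 = sraw(r2, r31)            # sraw   r0,r2,r31
    if to_signed32(r31) > 0:      # ble $+8 skips the ori when r31 <= 0
        r3 = r0                   # ori    r3,r0,0
    r2 = sraw(r2, r6)             # sraw   r2,r2,r6
    return r2, r3
```

The branch is needed because, unlike srw, an sraw by 32 or more does not produce zero for negative inputs, so its result cannot simply be ORed into the low word.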

Shift Right Algebraic, n = 3 (Shift Amount < 32)

subfic    r31,r6,32
srw       r4,r4,r6
slw       r0,r3,r31
or        r4,r4,r0
srw       r3,r3,r6
slw       r0,r2,r31
or        r3,r3,r0
sraw      r2,r2,r6
Appendix F
Floating-Point Models 



This appendix gives examples of how the floating-point conversion instructions can be 
used to perform various conversions. 

F.1 Conversion from Floating-Point Number to
Signed Fixed-Point Integer Word

The full convert to signed fixed-point integer word function can be implemented with the
sequence shown below, assuming that the floating-point value to be converted is in FPR1,
the result is returned in GPR3, and a double word at displacement "disp" from the address
in GPR1 can be used as scratch space.

fctiw[z]  f2,f1             #convert to fx int
stfd      f2,disp(r1)       #store float
lwz       r3,disp+4(r1)     #load word and zero
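
The data movement in this sequence (convert in a floating-point register, store the double word, reload the low-order word) can be mimicked in Python. This is only a sketch under stated assumptions: it models the round-toward-zero variant, it assumes the converted integer occupies the low-order 32 bits of the register image with the upper bits modeled as zero, it assumes out-of-range inputs saturate and a NaN yields x'8000 0000', and the function names are hypothetical.

```python
import math
import struct

def fctiwz_image(x):
    """64-bit register image after a round-toward-zero convert (upper word modeled as 0)."""
    if math.isnan(x):
        word = 0x80000000
    elif x >= 2**31:
        word = 0x7FFFFFFF                 # saturate at the largest positive value
    elif x <= -(2**31):
        word = 0x80000000                 # saturate at the most negative value
    else:
        word = math.trunc(x) & 0xFFFFFFFF
    return word

def float_to_signed_word(x):
    """Mimic stfd to scratch space followed by lwz from disp+4."""
    scratch = struct.pack(">Q", fctiwz_image(x))    # stfd f2,disp(r1), big-endian
    return struct.unpack_from(">I", scratch, 4)[0]  # lwz r3,disp+4(r1)
```

The load uses displacement disp+4 because, in big-endian storage order, the low-order word of the stored double word is the second of its two words.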

F.2 Conversion from Floating-Point Number to
Unsigned Fixed-Point Integer Word

The full convert to unsigned fixed-point integer word function can be implemented with the
sequence shown below, assuming that the floating-point value to be converted is in FPR1,
the value 0 is in FPR0, the value 2^32 - 1 is in FPR3, the value x'0000 0000 7FFF FFFF' is in
FPR4, the value 2^31 is in FPR5 and GPR5, the result is returned in GPR3, and a double
word at displacement "disp" from the address in GPR1 can be used as scratch space.

        fmr       f2,f0             #use 0 if < 0
        fcmpu     cr2,f1,f0
        blt       cr2,store
        fmr       f2,f4             #use max if > max
        fcmpu     cr2,f1,f3
        bgt       cr2,store
        fsub      f2,f1,f5          #subtract 2**31
        fcmpu     cr2,f1,f5         #use diff if >= 2**31
        bnl       cr2,$+8
        fmr       f2,f1
        fctiw[z]  f2,f2             #convert to fx int
store:  stfd      f2,disp(r1)       #store float
        lwz       r3,disp+4(r1)     #load word
        blt       cr2,$+8           #add 2**31 if input
        add       r3,r3,r5          #was >= 2**31
F.3 Floating-Point Models

This section describes models for floating-point instructions. 

F.3.1 Floating-Point Round to Single-Precision Model 

The following algorithm describes the operation of the Floating-Point Round to Single- 
Precision (frsp) instruction. 

If FRB[1-11] < 897 and FRB[1-63] > 0 then
   Do
      If FPSCR[UE]=0 then goto Disabled Exponent Underflow
      If FPSCR[UE]=1 then goto Enabled Exponent Underflow
   End

If FRB[1-11] > 1150 and FRB[1-11] < 2047 then
   Do
      If FPSCR[OE]=0 then goto Disabled Exponent Overflow
      If FPSCR[OE]=1 then goto Enabled Exponent Overflow
   End

If FRB[1-11] > 896 and FRB[1-11] < 1151 then goto Normal Operand

If FRB[1-63]=0 then goto Zero Operand

If FRB[1-11]=2047 then
   Do
      If FRB[12-63]=0 then goto Infinity Operand
      If FRB[12]=1 then goto QNaN Operand
      If FRB[12]=0 and FRB[13-63] > 0 then goto SNaN Operand
   End
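
The dispatch logic above, which selects a processing path from the exponent and fraction fields of the operand, can be sketched in Python. The function names and the returned strings are illustrative assumptions; the operand is passed as the raw 64-bit image of FRB, with bit 0 as the most significant bit.

```python
import struct

def double_bits(value):
    """Raw 64-bit image of a Python float (IEEE double precision)."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

def frsp_operand_path(frb, ue=0, oe=0):
    """Return the name of the path the frsp model takes for the 64-bit operand frb."""
    exponent = (frb >> 52) & 0x7FF           # FRB[1-11]
    magnitude = frb & ((1 << 63) - 1)        # FRB[1-63]
    fraction = frb & ((1 << 52) - 1)         # FRB[12-63]
    if exponent < 897 and magnitude > 0:
        return "Enabled Exponent Underflow" if ue else "Disabled Exponent Underflow"
    if 1150 < exponent < 2047:
        return "Enabled Exponent Overflow" if oe else "Disabled Exponent Overflow"
    if 896 < exponent < 1151:
        return "Normal Operand"
    if magnitude == 0:
        return "Zero Operand"
    if fraction == 0:                        # exponent is 2047 from here on
        return "Infinity Operand"
    if (frb >> 51) & 1:                      # FRB[12]
        return "QNaN Operand"
    return "SNaN Operand"
```

The thresholds 897 and 1150 are the double-precision biased exponents that correspond to the smallest and largest normalized single-precision exponents (-126 and +127, each plus the bias 1023).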

Disabled Exponent Underflow:

sign <- FRB[0]
If FRB[1-11]=0 then
   Do
      exp <- -1022
      frac <- b'0' || FRB[12-63]
   End
If FRB[1-11] > 0 then
   Do
      exp <- FRB[1-11] - 1023
      frac <- b'1' || FRB[12-63]
   End
Denormalize operand:
   G || R || X <- b'000'
   Do while exp < -126
      exp <- exp + 1
      frac || G || R || X <- b'0' || frac || G || (R | X)
   End
FPSCR[UX] <- frac[24-52] || G || R || X > 0
If frac[24-52] || G || R || X > 0 then FPSCR[XX] <- 1
Round single(sign, exp, frac, G, R, X)
If frac=0 then
   Do
      FRT[0] <- sign
      FRT[1-63] <- 0
      If sign=0 then FPSCR[FPRF] <- "+zero"
      If sign=1 then FPSCR[FPRF] <- "-zero"
   End
If frac > 0 then
   Do
      If frac[0]=1 then
         Do
            If sign=0 then FPSCR[FPRF] <- "+normal number"
            If sign=1 then FPSCR[FPRF] <- "-normal number"
         End
      If frac[0]=0 then
         Do
            If sign=0 then FPSCR[FPRF] <- "+denormalized number"
            If sign=1 then FPSCR[FPRF] <- "-denormalized number"
         End
      Normalize operand:
         Do while frac[0]=0
            exp <- exp - 1
            frac || G || R <- frac[1-52] || G || R || b'0'
         End
      FRT[0] <- sign
      FRT[1-11] <- exp + 1023
      FRT[12-63] <- frac[1-23] || b'0 0000 0000 0000 0000 0000 0000 0000'
   End
Done

Enabled Exponent Underflow:

FPSCR[UX] <- 1
sign <- FRB[0]
If FRB[1-11]=0 then
   Do
      exp <- -1022
      frac <- b'0' || FRB[12-63]
   End
If FRB[1-11] > 0 then
   Do
      exp <- FRB[1-11] - 1023
      frac <- b'1' || FRB[12-63]
   End
Normalize operand:
   Do while frac[0]=0
      exp <- exp - 1
      frac <- frac[1-52] || b'0'
   End
If frac[24-52] > 0 then FPSCR[XX] <- 1
Round single(sign, exp, frac, 0, 0, 0)
exp <- exp + 192
FRT[0] <- sign
FRT[1-11] <- exp + 1023
FRT[12-63] <- frac[1-23] || b'0 0000 0000 0000 0000 0000 0000 0000'
If sign=0 then FPSCR[FPRF] <- "+normal number"
If sign=1 then FPSCR[FPRF] <- "-normal number"
Done

Disabled Exponent Overflow:

inc <- 0
FPSCR[OX] <- 1
FPSCR[XX] <- 1
If FPSCR[RN]=b'00' then /* Round to Nearest */
   Do
      inc <- 0
      If FRB[0]=0 then FRT <- x'7FF0 0000 0000 0000'
      If FRB[0]=1 then FRT <- x'FFF0 0000 0000 0000'
      If FRB[0]=0 then FPSCR[FPRF] <- "+infinity"
      If FRB[0]=1 then FPSCR[FPRF] <- "-infinity"
   End
If FPSCR[RN]=b'01' then /* Round Truncate */
   Do
      If (b'0' || FRB[1-63]) < x'47EF FFFF E000 0000' then inc <- 0
      If FRB[0]=0 then FRT <- x'47EF FFFF E000 0000'
      If FRB[0]=1 then FRT <- x'C7EF FFFF E000 0000'
      If FRB[0]=0 then FPSCR[FPRF] <- "+normal number"
      If FRB[0]=1 then FPSCR[FPRF] <- "-normal number"
   End
If FPSCR[RN]=b'10' then /* Round to +Infinity */
   Do
      If FRB[0]=0 then inc <- 0
      If (FRB[0]=1 & FRB > x'C7EF FFFF E000 0000') then inc <- 1
      If FRB[0]=0 then FRT <- x'7FF0 0000 0000 0000'
      If FRB[0]=1 then FRT <- x'C7EF FFFF E000 0000'
      If FRB[0]=0 then FPSCR[FPRF] <- "+infinity"
      If FRB[0]=1 then FPSCR[FPRF] <- "-normal number"
   End
If FPSCR[RN]=b'11' then /* Round to -Infinity */
   Do
      If (FRB[0]=0 & FRB < x'47EF FFFF E000 0000') then inc <- 1
      If FRB[0]=1 then inc <- 1
      If FRB[0]=0 then FRT <- x'47EF FFFF E000 0000'
      If FRB[0]=1 then FRT <- x'FFF0 0000 0000 0000'
      If FRB[0]=0 then FPSCR[FPRF] <- "+normal number"
      If FRB[0]=1 then FPSCR[FPRF] <- "-infinity"
   End
FPSCR[FR] <- inc
FPSCR[FI] <- 1
Done

Enabled Exponent Overflow 

sign <- FRB[0]
exp <- FRB[1-11] - 1023
frac <- b'1' || FRB[12-63]
If frac[24-52]>0 then FPSCR[XX] <- 1
Round single (sign, exp, frac, 0, 0, 0)

Enabled Overflow

FPSCR[OX] <- 1
exp <- exp - 192
FRT[0] <- sign
FRT[1-11] <- exp + 1023
FRT[12-63] <- frac[1-23] || b'0 0000 0000 0000 0000 0000 0000 0000'
If sign=0 then FPSCR[FPRF] <- "+normal number"
If sign=1 then FPSCR[FPRF] <- "-normal number"
Done

Zero Operand 

FRT <- FRB
If FRB[0]=0 then FPSCR[FPRF] <- "+zero"
If FRB[0]=1 then FPSCR[FPRF] <- "-zero"
FPSCR[FR FI] <- b'00'
Done

Infinity Operand

FRT <- FRB
If FRB[0]=0 then FPSCR[FPRF] <- "+infinity"
If FRB[0]=1 then FPSCR[FPRF] <- "-infinity"
Done

QNaN Operand

FRT <- FRB[0-34] || b'0 0000 0000 0000 0000 0000 0000 0000'
FPSCR[FPRF] <- "QNaN"
FPSCR[FR FI] <- b'00'
Done

SNaN Operand 

FPSCR[VXSNAN] <- 1
If FPSCR[VE]=0 then
Do
    FRT[0-11] <- FRB[0-11]
    FRT[12] <- 1
    FRT[13-63] <- FRB[13-34] || b'0 0000 0000 0000 0000 0000 0000 0000'
    FPSCR[FPRF] <- "QNaN"
End
FPSCR[FR FI] <- b'00'
Done

Normal Operand 

sign <- FRB[0]
exp <- FRB[1-11] - 1023
frac <- b'1' || FRB[12-63]
If frac[24-52]>0 then FPSCR[XX] <- 1
Round single (sign, exp, frac, 0, 0, 0)
If exp>+127 and FPSCR[OE]=0 then go to Disabled Exponent Overflow
If exp>+127 and FPSCR[OE]=1 then go to Enabled Overflow
FRT[0] <- sign
FRT[1-11] <- exp + 1023
FRT[12-63] <- frac[1-23] || b'0 0000 0000 0000 0000 0000 0000 0000'
If sign=0 then FPSCR[FPRF] <- "+normal number"
If sign=1 then FPSCR[FPRF] <- "-normal number"
Done

Round Single (sign,exp,frac,G,R,X) 

inc <- 0
lsb <- frac[23]
gbit <- frac[24]
rbit <- frac[25]
xbit <- (frac[26-52] || G || R || X) ≠ 0
If FPSCR[RN]=b'00' then
Do
    If sign || lsb || gbit || rbit || xbit = b'u11uu' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'u011u' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'u01u1' then inc <- 1
End
If FPSCR[RN]=b'10' then
Do
    If sign || lsb || gbit || rbit || xbit = b'0u1uu' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'0uu1u' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'0uuu1' then inc <- 1
End
If FPSCR[RN]=b'11' then
Do
    If sign || lsb || gbit || rbit || xbit = b'1u1uu' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'1uu1u' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'1uuu1' then inc <- 1
End
frac[0-23] <- frac[0-23] + inc
If carry_out=1 then
Do
    frac[0-23] <- b'1' || frac[0-22]
    exp <- exp + 1
End
FPSCR[FR] <- inc
FPSCR[FI] <- gbit | rbit | xbit
Return
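
The increment rules of Round Single can be condensed into a few Boolean expressions. The following Python sketch is an illustration only and is not part of the architecture definition. The names round_single_increment and round_single are hypothetical, the fraction is passed as a 53-bit integer whose most significant bit plays the role of frac[0], and the G, R, and X inputs are assumed to be zero (the normal-operand case).

```python
def round_single_increment(sign, lsb, gbit, rbit, xbit, rn):
    """Return inc (0 or 1) for one Round Single step.

    All bit arguments are 0 or 1; rn is the two-bit FPSCR[RN] value (0..3).
    """
    inexact = gbit | rbit | xbit
    if rn == 0b00:                      # round to nearest, ties to even
        return gbit & (lsb | rbit | xbit)
    if rn == 0b01:                      # round toward zero: never increment
        return 0
    if rn == 0b10:                      # round toward +infinity
        return inexact if sign == 0 else 0
    if rn == 0b11:                      # round toward -infinity
        return inexact if sign == 1 else 0
    raise ValueError("rn must be in 0..3")


def round_single(sign, exp, frac, rn):
    """Round a 53-bit fraction (frac[0] is the MSB) to 24 bits.

    Returns (frac[0-23] as an integer, adjusted exp, inc).
    """
    hi = frac >> 29                                  # frac[0-23]
    lsb = hi & 1                                     # frac[23]
    gbit = (frac >> 28) & 1                          # frac[24]
    rbit = (frac >> 27) & 1                          # frac[25]
    xbit = int((frac & ((1 << 27) - 1)) != 0)        # frac[26-52] nonzero
    inc = round_single_increment(sign, lsb, gbit, rbit, xbit, rn)
    hi += inc
    if hi >> 24:                                     # carry out of frac[0]
        hi >>= 1                                     # b'1' followed by zeros
        exp += 1
    return hi, exp, inc
```

For round to nearest, the three bit patterns in the model reduce to "guard bit set, and either the least significant bit or any lower bit set," which is the usual ties-to-even rule.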

F.3.2 Floating-Point Convert to Integer Model

The following algorithm describes the operation of the floating-point convert to integer
instructions. In this example, u represents an undefined hexadecimal digit.

If Floating Convert to Integer Word
Then Do
    round_mode <- FPSCR[RN]
    tgt_precision <- "32-bit integer"
End
If Floating Convert to Integer Word with round toward Zero
Then Do
    round_mode <- b'01'
    tgt_precision <- "32-bit integer"
End
If Floating Convert to Integer Doubleword
Then Do
    round_mode <- FPSCR[RN]
    tgt_precision <- "64-bit integer"
End
If Floating Convert to Integer Doubleword with round toward Zero
Then Do
    round_mode <- b'01'
    tgt_precision <- "64-bit integer"
End
If FRB[1-11]=2047 and FRB[12-63]=0 then goto Infinity Operand
If FRB[1-11]=2047 and FRB[12]=0 then goto SNaN Operand
If FRB[1-11]=2047 and FRB[12]=1 then goto QNaN Operand
If FRB[1-11]>1086 then goto Large Operand

sign <- FRB[0]
If FRB[1-11]>0 then exp <- FRB[1-11] - 1023 /* exp - bias */
If FRB[1-11]=0 then exp <- -1022
If FRB[1-11]>0 then frac[0-64] <- b'01' || FRB[12-63] || b'00000000000' /* normal */
If FRB[1-11]=0 then frac[0-64] <- b'00' || FRB[12-63] || b'00000000000' /* denormal */

gbit || rbit || xbit <- b'000'
Do i=1,63-exp
    frac[0-64] || gbit || rbit || xbit <- b'0' || frac[0-64] || gbit || (rbit | xbit)
End

If gbit | rbit | xbit then FPSCR[XX] <- 1

Round Integer (frac,gbit,rbit,xbit,round_mode)

In this example, u represents an undefined hexadecimal digit. Comparisons ignore the u
bits.

If sign=1 then frac[0-64] <- ¬frac[0-64] + 1
If tgt_precision="32-bit integer" and frac[0-64] > +2^31 - 1
    then goto Large Operand
If tgt_precision="64-bit integer" and frac[0-64] > +2^63 - 1
    then goto Large Operand
If tgt_precision="32-bit integer" and frac[0-64] < -2^31 then goto Large Operand
If tgt_precision="64-bit integer" and frac[0-64] < -2^63 then goto Large Operand
If tgt_precision="32-bit integer"
    then FRT <- x'uuuu uuuu' || frac[33-64]
If tgt_precision="64-bit integer" then FRT <- frac[1-64]
FPSCR[FPRF] <- undefined
Done

Round Integer(frac,gbit,rbit,xbit,round_mode) 

In this example, u represents an undefined hexadecimal digit. Comparisons ignore the u 
bits. 

inc <- 0
If round_mode=b'00' then
Do
    If sign || frac[64] || gbit || rbit || xbit = b'u11uu' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'u011u' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'u01u1' then inc <- 1
End
If round_mode=b'10' then
Do
    If sign || frac[64] || gbit || rbit || xbit = b'0u1uu' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'0uu1u' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'0uuu1' then inc <- 1
End
If round_mode=b'11' then
Do
    If sign || frac[64] || gbit || rbit || xbit = b'1u1uu' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'1uu1u' then inc <- 1
    If sign || frac[64] || gbit || rbit || xbit = b'1uuu1' then inc <- 1
End
frac[0-64] <- frac[0-64] + inc
FPSCR[FR] <- inc
FPSCR[FI] <- gbit | rbit | xbit
Return






Infinity Operand 

FPSCR[FR FI VXCVI] <- b'001'
If FPSCR[VE]=0 then Do
    If tgt_precision="32-bit integer" then
    Do
        If sign=0 then FRT <- x'uuuu uuuu 7FFF FFFF'
        If sign=1 then FRT <- x'uuuu uuuu 8000 0000'
    End
    Else
    Do
        If sign=0 then FRT <- x'7FFF FFFF FFFF FFFF'
        If sign=1 then FRT <- x'8000 0000 0000 0000'
    End
    FPSCR[FPRF] <- undefined
End
Done

SNaN Operand 

FPSCR[FR FI VXCVI VXSNAN] <- b'0011'
If FPSCR[VE]=0 then
Do
    If tgt_precision="32-bit integer"
        then FRT <- x'uuuu uuuu 8000 0000'
    If tgt_precision="64-bit integer"
        then FRT <- x'8000 0000 0000 0000'
    FPSCR[FPRF] <- undefined
End
Done

QNaN Operand 

FPSCR[FR FI VXCVI] <- b'001'
If FPSCR[VE]=0 then
Do
    If tgt_precision="32-bit integer" then FRT <- x'uuuu uuuu 8000 0000'
    If tgt_precision="64-bit integer" then FRT <- x'8000 0000 0000 0000'
    FPSCR[FPRF] <- undefined
End
Done

Large Operand 

FPSCR[FR FI VXCVI] <- b'001'
If FPSCR[VE]=0 then Do
    If tgt_precision="32-bit integer" then
    Do
        If sign=0 then FRT <- x'uuuu uuuu 7FFF FFFF'
        If sign=1 then FRT <- x'uuuu uuuu 8000 0000'
    End
    Else
    Do
        If sign=0 then FRT <- x'7FFF FFFF FFFF FFFF'
        If sign=1 then FRT <- x'8000 0000 0000 0000'
    End
    FPSCR[FPRF] <- undefined
End
Done
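
The overall effect of the convert to integer model (round first, then saturate on out-of-range, infinite, or NaN operands) can be sketched in Python as follows. This is a behavioral illustration under stated assumptions, not the processor's implementation: the function name convert_to_int is hypothetical, exact rational arithmetic stands in for the 65-bit shift loop, and the returned flag stands in for FPSCR[VXCVI] with FPSCR[VE]=0.

```python
import math
from fractions import Fraction


def convert_to_int(x, bits=32, rn=0b00):
    """Return (integer result, invalid flag) for converting double x.

    bits is 32 or 64 (tgt_precision); rn is the two-bit rounding mode.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if math.isnan(x):
        return lo, True                 # NaN operands give x'8000...'
    if math.isinf(x):
        return (hi if x > 0 else lo), True
    q = Fraction(x)                     # exact value of the double
    n = q.numerator // q.denominator    # floor of the value
    rem = q - n                         # 0 <= rem < 1
    if rn == 0b00:                      # nearest, ties to even
        half = Fraction(1, 2)
        if rem > half or (rem == half and n % 2 == 1):
            n += 1
    elif rn == 0b01:                    # toward zero
        if q < 0 and rem != 0:
            n += 1
    elif rn == 0b10:                    # toward +infinity
        if rem != 0:
            n += 1
    elif rn != 0b11:                    # 0b11 (toward -infinity) keeps the floor
        raise ValueError("rn must be in 0..3")
    if n > hi:                          # Large Operand, positive
        return hi, True
    if n < lo:                          # Large Operand, negative
        return lo, True
    return n, False
```

For example, convert_to_int(2.5) gives (2, False) under round to nearest, while convert_to_int(1e10) saturates to (0x7FFFFFFF, True).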

F.4 Floating-Point Convert from Integer Model

The following algorithm describes the operation of the floating-point convert from integer 
instructions. 

sign <- FRB[0]
exp <- 63
frac <- FRB

If frac=0 then go to Zero Operand
If sign=1 then frac <- ¬frac + 1

Do until frac[0]=1
    frac <- frac[1-63] || b'0'
    exp <- exp - 1
End

Round Float (sign,exp,frac,FPSCR[RN])
If sign=1 then FPSCR[FPRF] <- "-normal number"
If sign=0 then FPSCR[FPRF] <- "+normal number"
FRT[0] <- sign
FRT[1-11] <- exp + 1023 /* exp + bias */
FRT[12-63] <- frac[1-52]
Done

Zero Operand 

FPSCR[FR FI] <- b'00'
FPSCR[FPRF] <- "+zero"
FRT <- x'0000 0000 0000 0000'

Done 

Round Float (sign,exp,frac,round_mode) 

In this example, the bits designated as u are ignored in comparisons. 



inc <- 0
lsb <- frac[52]
gbit <- frac[53]
rbit <- frac[54]
xbit <- frac[55-63] > 0
If round_mode=b'00' then
Do
    If sign || lsb || gbit || rbit || xbit = b'u11uu' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'u011u' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'u01u1' then inc <- 1
End
If round_mode=b'10' then
Do
    If sign || lsb || gbit || rbit || xbit = b'0u1uu' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'0uu1u' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'0uuu1' then inc <- 1
End
If round_mode=b'11' then
Do
    If sign || lsb || gbit || rbit || xbit = b'1u1uu' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'1uu1u' then inc <- 1
    If sign || lsb || gbit || rbit || xbit = b'1uuu1' then inc <- 1
End
frac[0-52] <- frac[0-52] + inc
If carry_out=1 then exp <- exp + 1
FPSCR[FR] <- inc
FPSCR[FI] <- gbit | rbit | xbit
If (gbit | rbit | xbit) then FPSCR[XX] <- 1
Return
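
The normalize-and-round steps of the convert from integer model can be followed in a short Python sketch. It is offered as an illustration under assumptions rather than as the processor's implementation: the names convert_from_int and double_bits are hypothetical, the input is taken to be a signed 64-bit integer, and the result is returned as the 64-bit IEEE double-precision bit pattern.

```python
import struct


def double_bits(x):
    """Bit pattern of a Python float as an unsigned 64-bit integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def convert_from_int(n, rn=0b00):
    """Convert signed 64-bit integer n to double-precision bits."""
    if n == 0:
        return 0                              # Zero Operand: +zero
    sign = 1 if n < 0 else 0
    frac = abs(n)                             # magnitude (two's complement negate)
    exp = 63
    while ((frac >> 63) & 1) == 0:            # Do until frac[0]=1
        frac = (frac << 1) & ((1 << 64) - 1)
        exp -= 1
    hi = frac >> 11                           # frac[0-52]
    lsb = hi & 1                              # frac[52]
    gbit = (frac >> 10) & 1                   # frac[53]
    rbit = (frac >> 9) & 1                    # frac[54]
    xbit = int((frac & 0x1FF) != 0)           # frac[55-63] nonzero
    inexact = gbit | rbit | xbit
    if rn == 0b00:
        inc = gbit & (lsb | rbit | xbit)      # nearest, ties to even
    elif rn == 0b01:
        inc = 0                               # toward zero
    elif rn == 0b10:
        inc = inexact if sign == 0 else 0     # toward +infinity
    else:
        inc = inexact if sign == 1 else 0     # toward -infinity
    hi += inc
    if hi >> 53:                              # carry out: renormalize
        hi >>= 1
        exp += 1
    return (sign << 63) | ((exp + 1023) << 52) | (hi & ((1 << 52) - 1))
```

With round to nearest, the result can be compared against the host's own conversion, for example convert_from_int(n) == double_bits(float(n)).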

Appendix G

Synchronization Programming Examples

The examples in this appendix show how synchronization instructions can be used to 
emulate various synchronization primitives and how to provide more complex forms of 
synchronization. 

For each of these examples, it is assumed that a similar sequence of instructions is used by 
all processes requiring synchronization of the accessed data. 

G.1 General Information 

The following points provide general information about the lwarx and stwcx. instructions:

• In general, lwarx and stwcx. instructions should be paired, with the same effective
address used for both. The exception is an isolated stwcx. instruction that is used to
clear any existing reservation on the processor, for which there is no paired lwarx
and for which any (scratch) effective address can be used.

• It is acceptable to execute an lwarx instruction for which no stwcx. instruction is
executed. For example, such a dangling lwarx instruction occurs if the value loaded
in the Test and Set sequence shown in Section G.2.5, "Test and Set," is not zero.

• To increase the likelihood that forward progress is made, it is important that looping
on lwarx/stwcx. pairs be minimized. For example, in the Test and Set sequence shown
in Section G.2.5, this is achieved by testing the old value before attempting the store;
were the order reversed, more stwcx. instructions might be executed, and
reservations might more often be lost between the lwarx and the stwcx. instructions.

• The manner in which lwarx and stwcx. are communicated to other processors and
mechanisms, and between levels of the memory subsystem within a given processor,
is implementation-dependent. In some implementations performance may be
improved by minimizing looping on an lwarx instruction that fails to return a desired
value. For example, in the Test and Set sequence in Section G.2.5, a programmer who
wishes to stay in the loop until the word loaded is zero could change the bne $+12
to bne loop. However, in some implementations better performance may be
obtained by using an ordinary Load instruction to do the initial checking of the
value, as follows:

loop: lwz    r5,0(r3)  #load the word
      cmpwi  r5,0      #loop back if word
      bne    loop      #not equal to 0
      lwarx  r5,0,r3   #try again, reserving
      cmpwi  r5,0      #(likely to succeed)
      bne    loop
      stwcx. r4,0,r3   #try to store nonzero
      bne    loop      #loop if lost reservation

• In a multiprocessor, livelock is possible if a loop containing an lwarx/stwcx. pair
also contains an ordinary Store instruction for which any byte of the affected
memory area is in the reservation granule of the reservation. For example, the first
code sequence shown in Section G.4, "List Insertion," can cause livelock if two list
elements have next element pointers in the same reservation granule.

G.2 Synchronization Primitives 

The following examples show how the lwarx and stwcx. instructions can be used to
emulate various synchronization primitives. The sequences used to emulate the various
primitives consist primarily of a loop using lwarx and stwcx. Additional synchronization
is unnecessary, because the stwcx. will fail, clearing the EQ bit, if the word loaded by lwarx
has changed before the stwcx. is executed.

G.2.1 Fetch and No-Op 

The Fetch and No-Op primitive atomically loads the current value in a word in memory. In
this example it is assumed that the address of the word to be loaded is in GPR3 and the data
loaded are returned in GPR4.

loop: lwarx  r4,0,r3   #load and reserve
      stwcx. r4,0,r3   #store old value if still reserved
      bne    loop      #loop if lost reservation

Notes: 

1. Because stwcx. is not necessarily performed with respect to all other mechanisms
that access memory, an ordinary load instruction, or even a Load and Reserve
instruction, on a different processor, may return a stale value. However, a
subsequent lwarx on the other processor followed by a successful stwcx. on that
processor is guaranteed to have returned the value stored by the first processor's
stwcx. (in the absence of other stores to the location).

2. The storing done by the stwcx. instruction in this example is redundant. 

G.2.2 Fetch and Store 

The Fetch and Store primitive atomically loads and replaces a word in memory. 

In this example it is assumed that the address of the word to be loaded and replaced is in
GPR3, the new value is in GPR4, and the old value is returned in GPR5.

loop: lwarx  r5,0,r3   #load and reserve
      stwcx. r4,0,r3   #store new value if still reserved
      bne    loop      #loop if lost reservation

G.2.3 Fetch and Add 

The Fetch and Add primitive atomically increments a word in memory. 

In this example it is assumed that the address of the word to be incremented is in GPR3, the 
increment is in GPR4, and the old value is returned in GPR5. 

loop: lwarx  r5,0,r3   #load and reserve
      add    r0,r4,r5  #increment word
      stwcx. r0,0,r3   #store new value if still reserved
      bne    loop      #loop if lost reservation
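
The retry behavior of the Fetch and Add loop can be explored with a toy Python model. The sketch below is an assumption-laden illustration, not a description of the MPC601 hardware: the class ReservedWord and the function fetch_and_add are hypothetical names, a single word stands in for memory, each processor is identified by an integer, and any store to the word is assumed to clear all reservations on it.

```python
class ReservedWord:
    """Toy model of one memory word with lwarx/stwcx.-style reservations."""

    def __init__(self, value=0):
        self.value = value
        self._reserved = set()          # ids of processors holding a reservation

    def lwarx(self, cpu):
        """Load the word and record a reservation for cpu."""
        self._reserved.add(cpu)
        return self.value

    def store(self, value):
        """Ordinary store: assumed to clear every reservation on the word."""
        self.value = value
        self._reserved.clear()

    def stwcx(self, cpu, value):
        """Conditional store: succeeds only if cpu still holds a reservation."""
        if cpu not in self._reserved:
            return False                # corresponds to EQ cleared
        self.store(value)
        return True                     # corresponds to EQ set


def fetch_and_add(word, cpu, increment, interfere=None):
    """Atomically add increment to word; return the old value.

    interfere, if given, is called once between lwarx and stwcx. to
    simulate a store by another processor.
    """
    while True:
        old = word.lwarx(cpu)                       # load and reserve
        if interfere is not None:
            interfere(word)
            interfere = None
        new = (old + increment) & 0xFFFFFFFF        # increment word
        if word.stwcx(cpu, new):                    # store if still reserved
            return old                              # else loop: reservation lost
```

When another processor stores to the word between the load and the conditional store, the first stwcx. fails and the loop repeats with the newly stored value, so the returned old value and the final sum both reflect the interfering store.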

G.2.4 Fetch and AND 

The Fetch and AND primitive atomically ANDs a value into a word in memory. 

In this example it is assumed that the address of the word to be ANDed is in GPR3, the 
value to AND into it is in GPR4, and the old value is returned in GPR5. 

loop: lwarx  r5,0,r3   #load and reserve
      and    r0,r4,r5  #AND word
      stwcx. r0,0,r3   #store new value if still reserved
      bne    loop      #loop if lost reservation

Note: This sequence can be changed to perform another Boolean operation atomically on 
a word in memory, simply by changing the AND instruction to the desired Boolean 
instruction (OR, XOR, etc.). 

G.2.5 Test and Set 

The Test and Set primitive atomically loads a word from memory, ensures that the word in
memory contains a non-zero value, and sets the EQ bit of CR Field 0 according to whether
the value loaded is zero.

In this example it is assumed that the address of the word to be tested is in GPR3, the new
value (non-zero) is in GPR4, and the old value is returned in GPR5.

loop: lwarx  r5,0,r3   #load and reserve
      cmpwi  r5,0      #done if word
      bne    $+12      #not equal to 0
      stwcx. r4,0,r3   #try to store nonzero
      bne    loop      #loop if lost reservation

Notes:

1. Test and Set is shown primarily for pedagogical reasons. It is useful on machines
that lack the better synchronization facilities provided by lwarx and stwcx. Test and
Set does not scale well. Using Test and Set before a critical section allows only one
process to execute in the critical section at a time. Using lwarx and stwcx. to bracket
the critical section allows many processes to execute in the critical section at once,
but at most one will succeed in exiting from the section with its results stored.

2. Depending on the application, if Test and Set fails (that is, clears the EQ bit of CR
Field 0) it may be appropriate to re-execute the Test and Set.

G.3 Compare and Swap 

The Compare and Swap primitive atomically compares a value in a register with a word in
memory. If they are surely equal, it stores the value from a second register into the word in
memory; if they may be unequal, it loads the word from memory into the first register. It sets
the EQ bit of CR Field 0 to indicate the result of the comparison.

In this example it is assumed that the address of the word to be tested is in GPR3, the
comparand is in GPR4, the new value is in GPR5, and the old value is returned in GPR6.

      lwarx  r6,0,r3   #load and reserve
      cmpw   r4,r6     #first 2 operands equal ?
      bne    $+8       #skip if not
      stwcx. r5,0,r3   #store new value if still reserved

Notes: 

1. Compare and Swap is shown primarily for pedagogical reasons. It is useful on
machines that lack the better synchronization facilities provided by lwarx and
stwcx. A major weakness of typical Compare and Swap instructions is that they
permit spurious success if the word being tested has changed and then changed back
to its old value; the sequence shown above does not have this weakness.

2. Depending on the application, if Compare and Swap fails (that is, clears the EQ bit
of CR0) it may be appropriate to recompute the value potentially to be stored and
then re-execute the Compare and Swap.
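
Note 1 can be made concrete with a small Python model of a reservation-based Compare and Swap. This sketch rests on simplifying assumptions and is not a description of the hardware: Word and compare_and_swap are hypothetical names, only one processor's reservation is modeled, and the between hook stands in for stores performed by other processors between the lwarx and the stwcx.

```python
class Word:
    """Toy memory word with a single modeled reservation."""

    def __init__(self, value):
        self.value = value
        self.reserved = False

    def lwarx(self):
        self.reserved = True
        return self.value

    def other_store(self, value):
        """Store by another processor: assumed to cancel the reservation."""
        self.value = value
        self.reserved = False

    def stwcx(self, value):
        ok, self.reserved = self.reserved, False
        if ok:
            self.value = value
        return ok


def compare_and_swap(word, expected, new, between=lambda w: None):
    """Return (success, old value), mirroring the sequence in this section."""
    old = word.lwarx()                 # load and reserve
    between(word)                      # activity by other processors, if any
    if old != expected:                # operands not equal: skip the store
        return False, old
    return word.stwcx(new), old        # store only if still reserved


def change_and_restore(w):
    """Another processor changes the word and then changes it back."""
    w.other_store(2)
    w.other_store(1)
```

In this model, compare_and_swap(Word(1), 1, 9, change_and_restore) reports failure even though the word again holds its old value, because the reservation was lost. A value-only comparison would have succeeded spuriously.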

G.4 List Insertion 

The following example shows how the lwarx and stwcx. instructions can be used to
implement simple LIFO (last-in-first-out) insertion into a singly-linked list. (Complicated
list insertion, in which multiple values must be changed atomically, or in which the correct
order of insertion depends on the contents of the elements, cannot be implemented in the
manner shown below, and requires a more complicated strategy such as using locks.)

The next element pointer from the list element after which the new element is to be inserted,
here called the parent element, is stored into the new element, so that the new element
points to the next element in the list; this store is performed unconditionally. Then the
address of the new element is conditionally stored into the parent element, thereby adding
the new element to the list.

In this example it is assumed that the address of the parent element is in GPR3, the address
of the new element is in GPR4, and the next element pointer is at offset 0 from the start of
the element. It is also assumed that the next element pointer of each list element is in a
reservation granule separate from that of the next element pointer of all other list elements.

loop: lwarx  r2,0,r3   #get next pointer
      stw    r2,0(r4)  #store in new element
      sync             #let store settle (can omit if not MP)
      stwcx. r4,0,r3   #add new element to list
      bne    loop      #loop if stwcx. failed

In the preceding example, if two list elements have next element pointers in the same
reservation granule then, in a multiprocessor, livelock can occur. (Livelock is a state in
which processors interact in a way such that no processor makes progress.)

If it is not possible to allocate list elements such that each element's next element pointer is 
in a different reservation granule, then livelock can be avoided by using the following, more 
complicated, code sequence. 

       lwz    r2,0(r3)  #get next pointer
loop1: mr     r5,r2     #keep a copy
       stw    r2,0(r4)  #store in new element
       sync             #let store settle
loop2: lwarx  r2,0,r3   #get it again
       cmpw   r2,r5     #loop if changed (someone
       bne    loop1     #else progressed)
       stwcx. r4,0,r3   #add new element to list
       bne    loop2     #loop if failed

Glossary of Terms and Abbreviations

The glossary contains an alphabetical list of terms, phrases, and abbreviations used in this 
book. Some of the terms and definitions included in the glossary are reprinted from IEEE 
Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, copyright ©1985 by 
the Institute of Electrical and Electronics Engineers, Inc. with the permission of the IEEE. 



A Atomic. A bus access that attempts to be part of a read-write operation to the
same address uninterrupted by any other access to that address (the
term refers to the fact that the transactions are indivisible). The
MPC601 initiates the read and write separately, but signals the
memory system that it is attempting an atomic operation. If the
operation fails, status is kept so that the MPC601 can try again. The
MPC601 implements atomic accesses through the lwarx/stwcx.
instruction pair, which asserts the TT0 signal.

B Beat. A single state on the MPC601 interface that may extend across multiple
bus cycles. An MPC601 transaction can be composed of multiple
address or data beats.

Biased Exponent. The sum of the exponent and a constant (bias) chosen to 
make the biased exponent's range non-negative. 

Big-Endian. A byte-ordering method in memory where the address n of a
word corresponds to the most significant byte. In an addressed
memory word, the bytes are ordered (left to right) 0, 1, 2, 3, with 0
being the most significant byte.

Boundedly Undefined. The results of attempting to execute a given 

instruction are said to be boundedly undefined if they could have 
been achieved by executing an arbitrary sequence of defined 
instructions, in valid form, starting in the state the machine was in 
before attempting to execute the given instruction. Boundedly 
undefined results for a given instruction may vary between 
implementations, and between execution attempts in the same 
implementation. 

Branch Folding. A technique of removing the branch instruction from the
instruction sequence.

Burst. A multiple beat data transfer whose total size is typically equal to a
cache block (in the MPC601, a 32-byte sector).

Bus Clock. Clock that causes the bus state transitions.

Bus Master. The owner of the address or data bus; the device that initiates or 
requests the transaction. 



C Cache. High-speed memory containing recently accessed data and/or 

instructions (subset of main memory). 

Cache Block. The cacheable unit for a PowerPC processor. The size of a
cache block may vary among processors. For the MPC601, it is one
sector (8 words).

Cache Coherency. Caches are coherent if a processor performing a read 

from its cache is supplied with data corresponding to the most recent 
value written to memory or to another processor's cache. 

Cast-Outs. Cache sectors that must be written to memory when a snoop miss 
causes the least recently used section with modified data to be 
replaced. 

Context Synchronization. All instructions in execution complete past the
point where they can produce an exception; all instructions in
execution complete in the context in which they began execution; all
subsequent instructions are fetched and executed in the new context.

Copy-Back Operations. A cache operation in which a cache line is copied 
back to memory to enforce cache coherency. Copy-back operations 
consist of snoop push-out operations and cache cast-out operations. 

D Denormalized Number. A non-zero floating-point number whose exponent
has a reserved value, usually the format's minimum, and whose
explicit or implicit leading significand bit is zero.

Dynamic Store Forwarding. Allows the FPU to collapse a floating-point 
arithmetic operation followed by a floating-point store operation that 
depends on the result of the arithmetic operation into a single 
operation through the pipeline. 

E Exception. An unusual or error condition encountered by the processor that
results in special processing.

Exception Handler. A software routine that executes when an exception
occurs. Normally, the exception handler corrects the condition that
caused the exception, or performs some other meaningful task (such
as aborting the program that caused the exception). The addresses of
the exception handlers are defined by a two-word exception vector
that is branched to automatically when an exception occurs.

Exclusive State. MESI state in which only one caching device contains data
that is also in system memory. Note that in the MPC601, shared
cache sectors are also described as shared exclusive, in that data is
the same in both the cache and in external memory.

Execution Synchronization. All instructions in execution are architecturally 
complete before beginning execution (appearing to begin execution) 
of the next instruction. Similar to context synchronization but doesn't 
force the contents of the instruction buffers to be deleted and 
refetched. 

Exponent. The component of a binary floating-point number that normally 
signifies the integer power to which two is raised in determining the 
value of the represented number. Occasionally the exponent is called 
the signed or unbiased exponent. 

F Feed Forwarding. An MPC601 feature that reduces the number of clock
cycles that an execution unit must wait to use a register. When the
source register of the current instruction is the same as the
destination register of the previous instruction, the result of the
previous instruction is routed to the current instruction at the same
time that it is written to the register file. With feed forwarding, the
destination bus is gated to the waiting execution unit over the
appropriate source bus, saving the cycles which would be used for
the write and read.

Floating-Point Unit. The functional unit in the MPC601 processor
responsible for executing all floating-point instructions plus integer
multiply and divide instructions.

Flush. An operation that causes a modified cache sector to be invalidated and 
the data to be written to memory. 

Fraction. The field of the significand that lies to the right of its implied binary 
point. 

G General-Purpose Registers. Any of the 32 registers in the MPC601 register
file. These registers provide the source operands and destination
results for all MPC601 data manipulation instructions. Load
instructions move data from memory to registers, and store
instructions move data from registers to memory.

I IEEE 754. A standard written by the Institute of Electrical and Electronics
Engineers that defines operations of binary floating-point arithmetic
and representations of binary floating-point numbers.

Instruction Unit. The functional unit in the MPC601 processor that fetches 
all instructions from memory and performs the initial stages of 
instruction decoding. The instruction unit also contains the branch 
processing unit and performs all instruction address calculations 
(including branch address calculations). 

Integer Unit. The functional unit in the MPC601 processor responsible for
executing all instructions except floating point, integer multiply and
divide, and change of flow instructions.

Interrupt. An external signal that causes the MPC601 to suspend current
execution and take a predefined exception.

Invalid State. MESI state (I) that indicates that the cache sector does not 
contain valid data. 

K Kill. An operation that causes a cache sector to be invalidated. 

L Latency. The number of clock cycles necessary to execute an instruction and 

make ready the results of that instruction. 

Little-Endian. A byte-ordering method in memory where the address n of a 
word corresponds to the least significant byte. In an addressed 
memory word, the bytes are ordered (left to right) 3, 2, 1, 0, with 3 
being the most significant byte. 
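
The two byte orderings defined in this glossary can be compared with a short Python sketch. It is illustrative only; the function name bytes_in_address_order and the sample word are assumptions, and the list index stands in for the byte address offset from n.

```python
def bytes_in_address_order(word, byte_order):
    """Return the four bytes of a 32-bit word as stored at addresses n..n+3.

    byte_order is "big" or "little".
    """
    return list(word.to_bytes(4, byte_order))


word = 0x0A0B0C0D
big = bytes_in_address_order(word, "big")        # most significant byte at address n
little = bytes_in_address_order(word, "little")  # least significant byte at address n
```

Here big[0] is 0x0A, the most significant byte, while little[0] is 0x0D, the least significant byte.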

Livelock. A state in which processors interact in a way such that no processor
makes progress.

M Memory-Mapped Accesses. Accesses whose addresses use the segmented 

or block address translation mechanisms provided by the MMU and 
that occur externally with the bus protocol defined for memory. 

Memory Coherency. Refers to memory agreement between caches in a
multiprocessor system and system memory (e.g. MESI cache
coherency).

Memory Consistency. Refers to levels of memory with respect to a single
processor and system memory (e.g. on-chip cache, secondary cache,
and system memory).

Memory-Forced I/O Controller Interface Access (BUID = x'07F'). These
accesses are made to memory space. They do not use the extensions
to the memory protocol described for I/O controller interface
accesses, and they bypass the page- and block-translation and
protection mechanisms.

MESI Modified. MESI state in which one, and only one, caching device has 
the valid data for that address. The data at this address in external 
memory is not valid. 

N NaN. Not a number; a symbolic entity encoded in floating-point format. 

There are two types of NaNs — signaling NaNs and quiet NaNs. 

No-Op. No-operation. A single-cycle operation that does not affect registers 
or generate bus activity. 

O Overflow. An error condition that occurs during arithmetic operations when 

the result cannot be stored accurately in the destination register(s). 
For example, if two 32-bit numbers are added, the sum may require 
33 bits due to carry. Since the 32-bit registers of the MPC601 cannot 
represent this sum, an overflow condition occurs. 
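
The 33-bit sum mentioned in the Overflow entry can be illustrated in Python. The sketch below is an assumption-based example, not a description of the MPC601's status bits: the function name add32 is hypothetical, and it reports the carry out of bit 32 separately from signed (two's complement) overflow, a distinction the glossary entry does not spell out.

```python
def add32(a, b):
    """Add two unsigned 32-bit values; return (result, carry, signed_overflow)."""
    full = a + b                          # may need 33 bits
    result = full & 0xFFFFFFFF            # what a 32-bit register can hold
    carry = full >> 32                    # the 33rd bit
    sa, sb, sr = a >> 31, b >> 31, result >> 31
    # signed overflow: operands share a sign that the result does not
    signed_overflow = int(sa == sb and sr != sa)
    return result, carry, signed_overflow
```

For example, add32(0xFFFFFFFF, 1) produces a carry with a zero result, while add32(0x7FFFFFFF, 1) produces no carry but overflows as a signed addition.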



P Packet. A term used in the MPC601 with respect to I/O controller interface
operations.

Page. A 4-Kbyte area of memory, aligned on a 4-Kbyte boundary.

Park. The act of allowing a bus master to maintain mastership of the bus 
without having to arbitrate. 

Pipelining. A technique that breaks instruction execution into distinct steps 
so that multiple steps can be performed at the same time. 

Precise Exceptions. The pipeline can be stopped so the instructions that
preceded the faulting instruction can complete, and subsequent
instructions can be executed from scratch. The system is precise
unless one of the imprecise modes for invoking the floating-point
enabled exception is in effect.

Processor Clock. Internal P_CLOCK signal. 



Q Quiesce. To come to rest. The processor is said to quiesce when an exception 

is taken or a sync instruction is executed. The instruction stream is 
stopped at the decode stage and executing instructions are allowed to 
complete to create a controlled context for instructions that may be 
affected by out-of-order, parallel execution. See Synchronization. 

Quiet NaNs. Propagate through almost every arithmetic operation without 
signaling exceptions. These are used to represent the results of 
certain invalid operations, such as invalid arithmetic operations on 
infinities or on NaNs, when invalid. 

R Register Renaming. The use of shadowing that allows a register to be
updated by instructions that are executed out of order without
destroying machine state information. The MPC601 implements a
two-entry link register shadow to improve performance and to help 
handle the precise exception model required by PowerPC. 

S Scan Interface. The MPC601's test interface.

Sector. One half of an MPC601 cache line. Each MPC601 cache line is 16
words long; therefore, each sector is 8 words long. Cache coherency 
is maintained with sector granularity. In the MPC601, the sector is 
equivalent to a cache block. 
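
The line and sector sizes above translate into simple address arithmetic. The sketch below assumes 4-byte (32-bit) words, so a line is 64 bytes and a sector is 32 bytes; the helper names are hypothetical, and the code models address decomposition only, not the cache's internal organization.

```c
#include <stdint.h>

enum {
    WORD_BYTES   = 4,                 /* assumed 32-bit word */
    LINE_BYTES   = 16 * WORD_BYTES,   /* 16 words = 64 bytes */
    SECTOR_BYTES = 8 * WORD_BYTES     /* 8 words  = 32 bytes */
};

/* Address of the first byte of the cache line containing addr. */
uint32_t line_base(uint32_t addr)
{
    return addr & ~(uint32_t)(LINE_BYTES - 1);
}

/* Address of the first byte of the sector containing addr. */
uint32_t sector_base(uint32_t addr)
{
    return addr & ~(uint32_t)(SECTOR_BYTES - 1);
}

/* Which half of its line addr falls in: 0 for the first sector, 1 for the second. */
unsigned sector_in_line(uint32_t addr)
{
    return (addr / SECTOR_BYTES) % 2u;
}
```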

Shared State. MESI protocol state in which two or more caching devices 
contain the same information. In the MPC601, shared implies
shared, exclusive. That is, shared data is identical to the data at that 
address in system memory. 

Signaling NaNs. Signal the invalid operation exception when they are 
specified as arithmetic operands.

Significand. The component of a binary floating-point number that consists
of an explicit or implicit leading bit to the left of its implied binary 
point and a fraction field to the right. 

Slave. The device addressed by a master device. The slave is identified in the 
address tenure and is responsible for supplying or latching the 
requested data for the master during the data tenure. 
Snooping. Monitoring addresses driven by a bus master to detect the need for
coherency actions. 

Snoop Push. Write-backs due to a snoop hit. The sector may or may not 
transition to shared state. 

Split-Transaction. A transaction with independent request and response 
tenures. 

Split-Transaction Bus. A bus that allows address and data transactions from 
different processors to occur independently. 

Static Branch Prediction. Mechanism by which software (for example, 
compilers) can give a hint to the machine hardware about the 
direction the branch is likely to take. 

Superscalar Machine. A machine that can issue multiple instructions 
concurrently from a conventional linear instruction stream. 

Supervisor Mode. The privileged operation state of the MPC601. In
supervisor mode, software can access all control registers and can
access the supervisor memory space, among other privileged 
operations. 

T Tenure. The period of bus mastership. For the MPC601, there can be separate
address bus tenures and data bus tenures. A tenure consists of three
phases: arbitration, transfer, and termination.

Transaction. A complete exchange between two bus devices. A transaction 
is minimally comprised of an address tenure; one or more data 
tenures may be involved in the exchange. There are two kinds of 
transactions: address/data and address-only. 

Transfer Termination. Signal that refers to both signals that acknowledge 
the transfer of individual beats (of both single-beat transfer and 
individual beats of a burst transfer) and to signals that mark the end 
of the tenure. 



U Underflow. An error condition that occurs during arithmetic operations when
the result cannot be represented accurately in the destination register.
For example, underflow can happen if two floating-point fractions
are multiplied and the result is a single-precision number. The result
may require a larger exponent and/or mantissa than the single-precision
format makes available. In other words, the result is too
small to be represented accurately.
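
The multiplication example can be sketched in C. This is a hedged illustration that flags a nonzero product whose magnitude falls below the smallest normalized single-precision value (`FLT_MIN`); it does not model the processor's exception reporting, and `product_underflows` is a hypothetical name.

```c
#include <float.h>
#include <stdbool.h>

/* Returns true when a * b is too small in magnitude to be represented as a
 * normalized single-precision number (the result is denormalized or zero). */
bool product_underflows(float a, float b)
{
    volatile float p = a * b;          /* volatile forces rounding to single precision */
    float rounded = p;
    float mag = (rounded < 0.0f) ? -rounded : rounded;

    /* A zero operand gives an exact zero, which is not an underflow. */
    return a != 0.0f && b != 0.0f && mag < FLT_MIN;
}
```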
Unified Cache. Combined data and instruction cache.

User Mode. The unprivileged operating state of the MPC601. In user mode,
software can only access certain control registers and can only 
access user memory space. No privileged operations can be 
performed. 



W Write-Through. A memory update policy in which all processor write cycles
are written to both the cache and memory.
A0-A31, 8-7
AACK, 8-16
ABB, 8-5, 9-8
abs, 10-7 
add, 10-8 
addc, 10-9 
adde, 10-10 
addi, 3-92, 10-11
addic, 10-12, 10-13 
addis, 3-92, 10-14 
addme, 10-15 
Address bus 

address tenure, 9-7, 9-37 
address transfer 

A0-A31, 8-7 

AP0-AP3, 8-8 

APE, 8-9

signals, 9-12
address transfer attribute

CI, 8-14

CSE0-CSE2 signals, 8-15

GBL, 8-15

HP_SNP_REQ, 8-16

TBST, 8-13, 9-15

TC0-TC1, 8-14, 9-17

TSIZ0-TSIZ2, 8-12, 9-14

TT0-TT4, 8-10, 9-13

WT, 8-15
address transfer start

TS, 8-6

XATS, 8-6
address transfer termination

AACK, 8-16

ARTRY, 8-17

SHD, 8-18

terminating address transfer, 9-18
arbitration signals, 9-8
bus arbitration, 9-10

ABB, 8-5

BG, 8-4
BR, 8-4 
Address calculation 

branch instructions, 3-63 
Address translation, see Memory management unit 
Addressing 

branch conditional relative, 3-64 
branch conditional to absolute, 3-66 



branch conditional to count register, 3-66 

branch conditional to link register, 3-66 

branch relative, 3-64 

branch to absolute, 3-65 

immediate index, floating-point, 3-55 

register indirect with immediate index, integer, 3-42

register indirect, floating-point, 3-56 

register indirect, integer, 3-44 
addze, 10-16 

Aligned data transfer, 2-43, 9-15 
Alignment 

exception, 5-25, 6-13 

rules, 2-44, 2-47
and, 10-17 
andc, 10-18 
andi, 10-19 
andis., 10-20 
AP0-AP3, 8-8 
APE, 8-9

Arbitration, system bus, 9-10, 9-21
ARTRY, 4-17, 8-17

Asynchronous exceptions, 5-6, 5-7, 5-9
Atomic memory references 

stwcx., 3-53, 10-197 

using lwarx/stwcx., 4-14

B 

b, 10-21 

bc, 10-22

bcctr, 10-23

BCLK_EN, 8-32

bclr, 10-24

BG, 8-4, 9-8

BI operand, 3-69 

Big-endian byte ordering, 2-45 

Block address translation 
BAT registers, 2-33, 6-26 
block address translation flow, 6-8, 6-30 
block memory protection, 6-19, 6-29, 6-31 
block size options, 6-28 
BTLB organization, 6-24 
generation of physical addresses, 6-29 
selection of block address translation, 6-5, 6-26 

BO operand encodings, 3-68, B-3 

BR, 8-4, 9-8 

Branch folding, 1-6, 7-15
Branch instructions

address calculation, 3-63 

condition register logical, 3-75 

description, 3-74 

simplified branch mnemonics, 3-69 

simplified mnemonics, 3-77 
Branch prediction, 7-15 
Branch processing unit 

execution timing, 7-14 

overview, 1-6 
Breakpoints 

breakpoint control, 6-12 

data breakpoints, 6-12, 6-13 

instruction breakpoints, 6-12, 6-13 
Burst transfers 

transfers with data delays, timing, 9-34 
Bus unit ID (BUID), 9-36 
Byte ordering 

default, 2-44 

endian selection, 2-36 



Cache arbitration, 4-3, 7-6 
Cache cast-out operation, 4-4 
Cache coherency 

actions on load operations, 4-13 

actions on store operations, 4-14 

bus interface logic, 4-25 

cache control instructions, 4-17 

cache snoop, 4-14 

coherency precautions, 4-11

copy-back operation, 6-17

in multiprocessor systems, 4-12

in single-processor systems, 4-12

overview, 4-1, 4-6

reaction to bus operations, 4-14 

WIM bits, 4-7, 6-10, 6-16, 6-57, 6-61 

write-back mode, 6-17 
Cache control instructions 

bus operations, 4-21 

clcs, 3-88, 4-18, 10-25 

dcbf, 3-89, 4-20, 10-39 

dcbst, 3-88, 4-19, 10-41 

dcbt, 3-87, 4-18, 10-42

dcbtst, 3-87, 4-19, 10-43

dcbz, 3-88, 4-19, 10-44

eieio, 4-20, 10-54

icbi, 3-87, 4-21, 10-77

isync, 4-21, 10-78

purpose, 4-17 
Cache hit, 7-7 
Cache miss, 7-8 
Cache operations 

cache cast-out operation, 4-4 



cache data transactions, 4-5 

cache sector line-fill operation, 4-5 

cache sector push operation, 4-5, 4-17 

overview, 1-9, 4-1

response to bus transactions, 4-14 
Cache organization, 4-2 
Cache sector line-fill operation, 4-5 
Cache sector push operation, 4-5, 4-17 
Cache unit 

memory performance, 7-25 

operation of the cache, 9-2 

overview, 4-1 
Cache-inhibited accesses (I bit) 

cache interactions, 4-7 

MMU (I-bit setting), 6-10, 6-16, 6-57, 6-61

timing considerations, 7-26 
Change (C) bit maintenance 

recording, 6-8, 6-39, 6-40, 6-41 

updates, 6-56 
Checkstop signal, 9-47 

Checkstop sources and enables register (HID0), 2-36
Checkstop state, 5-21
CI, 8-14
CKSTP_IN, 8-25
CKSTP_OUT, 8-26
clcs, 4-18, 10-25
Clean block operation, 4-15
Clock signals

2X_PCLK, 8-31

BCLK_EN, 8-32

PCLK_EN, 8-31

RTC, 8-35 
cmp, 10-26 
cmpi, 10-27 
cmpl, 10-28 
cmpli, 10-29 
cntlzd, C-4 
cntlzw, 10-30 

Coherency precautions, 4-11 
Complement register, simplified mnemonic, 3-93 
Context synchronization, 2-24, 3-2 
Copy-back mode, 7-25 
CR (condition register) 

CR bit fields, 2-11 

CR settings, 3-39, B-2 
crand, 10-31 
crandc, 10-32 
creqv, 10-33 
crnand, 10-34 
crnor, 10-35 
cror, 10-36 
crorc, 10-37 
crxor, 10-38 
CSE0-CSE2 signals, 8-15, 9-27 
CTR (count register), 2-19



DABR (data address breakpoint register, HID5), 2-40

DAR (data address register), 2-27 

Data access exception, 5-21 

Data breakpoints (DABR), 2-40, 6-12, 6-13 

Data bus 

arbitration signals, 8-19, 9-8 

bus arbitration, 9-21 

data tenure, 9-7, 9-37 

data transfer, 8-21, 9-22

data transfer termination, 8-23, 9-23
Data transfers, alignment, 2-43, 9-15, 9-17
DBB, 8-20, 9-8, 9-22
DBG, 8-19, 9-8
DBWO, 8-19, 9-8, 9-50

dcbf, 4-20, 10-39

dcbi, 10-40

dcbst, 4-19, 10-41

dcbt, 4-18, 10-42

dcbtst, 4-19, 10-43

dcbz, 4-19, 10-44 

Debug modes register (HID1), 2-38

DEC (decrementer register), 2-28, B-7 

Decode timing, 7-10 

Decrementer exception, 5-45 

Defined instruction class, D-1 

DH0-DH31/DL0-DL31, 8-21 

Direct address translation (translation disabled) 

data accesses, 6-7, 6-8, 6-16, 6-24, 6-34 

instruction accesses, 6-7, 6-8, 6-16, 6-24, 6-34 
div, 10-45 
divd, C-4 
divdu, C-5 
divs, 10-46 
divw, 10-47 
divwu, 10-49 

Double-speed processor clock (2X_PCLK), 8-31
doz, 10-50
dozi, 10-51
DP0-DP7, 8-22
DPE, 8-23

DRTRY, 8-24, 9-23, 9-25
DSISR (DAE/source instruction service register) 

format, 2-27 

settings for alignment exception, 5-30 

settings for DAE, 5-21 



EAR (external access register), 2-31
eciwx, 10-52 



ecowx, 10-53 

Effective address calculation 
address translation, 6-1 
branches, 3-2, 3-63 
loads and stores, 3-2, 3-42, 3-55 

eieio, 3-53, 4-20, 10-54 

eqv, 10-55 

Error termination, 9-25 

ESP interface, 8-29 

Exceptions 

alignment exception, 5-25 
asynchronous exceptions, 5-6, 5-7
data access exception, 5-21
decrementer exception, 5-45
enabling and disabling, 5-13
exception classes, 5-2
exception priorities, 5-7
exception processing, 5-9, 5-13
external interrupt, 5-25
FP unavailable exception, 5-44
I/O controller interface error, 5-46
instruction access exception, 5-24
machine check exception, 5-19
precise exceptions, 5-5
priorities, 5-10
program exception, 5-32
register settings
FPSCR, 5-33
MSR, 5-11, 5-15
SRR0, SRR1, 5-10
reset, 5-16 

run mode exception, 5-48 
summary, 3-3 
summary table, 5-2 
synchronous/precise, 5-6 
system call exception, 5-47 
vector offset table, 5-2, 5-16 

Execute timing, instruction, 7-12 

Execution units, 1-6 

External control instructions, 3-90 

extsb, 10-56 

extsh, 10-57 

extsw, C-6 

F 

fabs, 10-58 
fadd, 10-59 
fcfid, C-7 
fcmpo, 10-60 
fcmpu, 10-61 
fctid, C-7
fctidz, C-8 
fctiw, 10-62 
fctiwz, 10-63 
fdiv, 10-64

Features, MPC601, 1-2, 1-13 
Feed forwarding, 7-3 
Floating-point instructions 

data formats, 2-59 

floating-point models, F-2 

IEEE-754 compatibility, 2-55

precision handling, 2-66 

rounding, 2-68 
Floating-point model 

compare instructions, 3-39 

FP arithmetic instructions, 3-30 

FP exception mode bits, 5-12 

FP multiply-add instructions, 3-34 

FP unavailable exception, 5-44 

FPSCR instructions, 3-40 

program exceptions, 5-33 

rounding and conversion instructions, 3-37 
Floating-point numbers, conversion, F-1 
Floating-point unit 

execution timing, 7-22 

overview, 1-7 
Flow control instructions, 3-63 

branch instructions, 3-74 

condition register logical, 3-75 

system linkage, 3-76 
Flush block operation, 4-15 
fmadd, 10-65 
fmr, 10-66 
fmsub, 10-67 
fmul, 10-68 
fnabs, 10-69 
fneg, 10-70 
fnmadd, 10-71 
fnmsub, 10-73 

FPCC (floating-point condition code), 3-39 
FPR0-FPR31 (floating-point registers), 2-6 
FPSCR (floating-point status and control register), 2-7 
FPSCR instructions, 3-40 
fres, C-9 
frsp, 10-75 
frsqrte, C-10 
fsel, C-11
fsqrt, C-11
fsub, 10-76 

G 

GBL, 8-15

GPR0-GPR31 (general purpose registers), 2-6 
Guarded memory, 6-12 

H 

Hardware considerations, MESI, 4-10



Hashed page tables, 6-41 
Hashing functions 

primary PTEG, 6-45, 6-52, 6-55

secondary PTEG, 6-45, 6-53, 6-56
HID registers, 2-35
HP_SNP_REQ, 8-16
HRESET, 8-26 

I 

I/O controller interface 

address translation, 6-59 

alignment exception, 5-27 

architectural ramifications of accesses, 9-36 

bus protocol 

address and data tenures, 9-37 
detailed description, 9-41 
load access, timing, 9-46 
load operations, 9-40 
store access, timing, 9-47 
store operations, 9-39 
transactions, 9-38 
XATS signal, 9-37
I/O controller interface error exception, 5-46 
memory-forced accesses, 6-61 
no-op instructions, 6-62 
operations, 8-8 
protection, 6-60 
selection of I/O controller interface segments, 6-35
unsupported instructions, 6-61 
I/O tenures, 9-38 
IABR (instruction address breakpoint register, HID2), 2-39
icbi, 4-21, 10-77

IEEE 1149.1-compatible interface, 9-49
Illegal instruction class, D-2 
Imprecise exceptions, 5-7, 5-9 
Instruction 

stmw, 10-192 
Instruction access exception, 5-24 
Instruction breakpoints (IABR), 2-39, 6-12, 6-13
Instruction flow, 7-4 
Instruction prefetch 

MMU constraints, 6-11
Instruction queue, 1-6, 7-4 
Instruction stages, 7-5 
Instruction timing 

instruction flow, 7-4 
instruction queue, 7-4 
instruction stages, 7-5 
overview, 7-1 
timing considerations, 7-2 
Instruction TLB (ITLB), 6-15 
Instruction unit, 1-5 
Instructions
abs, 10-7
add, 10-8
addc, 10-9
adde, 10-10
addi, 3-92, 10-11
addic, 10-12, 10-13 
addis, 3-92, 10-14 
addme, 10-15 
addze, 10-16 
and, 10-17 
andc, 10-18 
andi, 10-19 
andis., 10-20 
b, 10-21 
bc, 10-22
bcctr, 10-23 
bclr, 10-24 

branch address calculation, 3-63 
branch instructions, 3-74 
cache management instructions, 4-17 
classes of instructions, D-1 
clcs, 4-18, 10-25
cmp, 10-26 
cmpi, 10-27 
cmpl, 10-28 
cmpli, 10-29 
cntlzd, C-4 
cntlzw, 10-30 

condition register logical, 3-75 
crand, 10-31
crandc, 10-32 
creqv, 10-33 
crnand, 10-34 
crnor, 10-35 
cror, 10-36 
crorc, 10-37 
crxor, 10-38 
dcbf, 4-20, 10-39 
dcbi, 10-40
dcbst, 4-19, 10-41
dcbt, 4-18, 10-42
dcbtst, 4-19, 10-43
dcbz, 4-19, 10-44
defined instructions, D-1 
div, 10-45 
divd, C-4 
divdu, C-5 
divs, 10-46 
divw, 10-47 
divwu, 10-49 
doz, 10-50 
dozi, 10-51
eciwx, 3-90, 10-52 
ecowx, 3-90, 10-53 



eieio, 3-53, 4-20, 10-54 

eqv, 10-55

external control, 3-90 

extsb, 10-56 

extsh, 10-57 

extsw, C-6 

fabs, 10-58 

fadd, 10-59 

fcfid, C-7

fcmpo, 10-60

fcmpu, 10-61

fctid, C-7

fctidz, C-8

fctiw, 10-62

fctiwz, 10-63

fdiv, 10-64 

floating-point 

arithmetic, 3-30 

compare, 3-30, 3-39 

double-precision conversion, load, 3-59 

double-precision conversion, store, 3-61

FP status and control register, 3-40 

multiply-add, 3-30, 3-34 

rounding and conversion, 3-30, 3-37 

status and control register, 3-30 
floating-point models, F-2 
flow control, 3-1, 3-63 
fmadd, 10-65 
fmr, 10-66 
fmsub, 10-67 
fmul, 10-68 
fnabs, 10-69 
fneg, 10-70 
fnmadd, 10-71 
fnmsub, 10-73 
fres, C-9 
frsp, 10-75 
frsqrte, C-10
fsel, C-11
fsqrt, C-11
fsub, 10-76
icbi, 4-21, 10-77
illegal instructions, D-2 
integer 

arithmetic, 3-4 

compare, 3-4, 3-15 

logical, 3-4, 3-16 

rotate, 3-20 

rotate and shift, 3-4, 3-18, 3-19 

shift, 3-20 
invalid forms, D-2 
isync, 3-53, 4-21, 10-78 
latency summary, 7-26 
lbz, 10-79
lbzu, 10-80
lbzux, 10-81
lbzx, 10-82
ld, C-13
ldarx, C-13
ldu, C-14
ldux, C-15
ldx, C-15
lfd, 10-83
lfdu, 10-84
lfdux, 10-85
lfdx, 10-86
lfs, 10-87
lfsu, 10-88
lfsux, 10-89
lfsx, 10-90
lha, 10-91
lhau, 10-92
lhaux, 10-93
lhax, 10-94
lhbrx, 10-95
lhz, 10-96
lhzu, 10-97
lhzux, 10-98
lhzx, 10-99
lmw, 10-100, B-4
load and store 

address generation, floating-point, 3-55 

address generation, integer, 3-42 

byte reversal instructions, 3-48 

double-precision conversion for FP load, 3-59

double-precision conversion for FP store, 3-61

floating-point load, 3-57 

floating-point move, 3-62 

floating-point store, 3-60 

integer load, 3-44 

integer multiple, 3-49 

integer store, 3-47 

move multiple, 3-50 
lscbx, 10-101
lswi, 10-103, B-4
lswx, 10-104, B-4
lwa, C-16

lwarx, 3-53, 10-105
lwaux, C-16
lwax, C-17
lwbrx, 10-106
lwz, 10-107
lwzu, 10-108
lwzux, 10-109
lwzx, 10-110
maskg, 10-111 
maskir, 10-112 



mcrf, 10-113 

mcrfs, 10-114

mcrxr, 10-115 

memory control, 3-85 

mfcr, 10-117 

mffs, 10-118 

mfmsr, 10-119, B-1 

mfspr, 3-80, 10-120, B-5

mfsr, 10-123, B-1 

mfsrin, 10-124 

mftb, C-17 

mtcrf, 10-125 

mtfsb0, 10-126

mtfsb1, 10-127

mtfsf, 10-128

mtmsr, 10-130

mtspr, 3-80, 10-131, B-5

mtsr, 10-133 

mtsrin, 10-134 

mul, 10-135 

mulhd, C-18 

mulhdu, C-19 

mulhw, 10-136 

mulhwu, 10-137 

mulld, C-19

mulli, 10-139

mullw, 10-138

nabs, 10-140 

nand, 10-141 

neg, 10-142 

no-op, 3-92 

nor, 10-143 

or, 10-144 

orc, 10-145

ori, 10-146 

oris, 10-147 

POWER instructions in PowerPC, B-9 

POWER instructions, deleted, B-8 

PowerPC instructions, list, A-1 

processor control, 3-1, 3-80 

reserved bits, B-1 

reserved instructions, D-3 

rfi, 10-148 

rldcl, C-20 

rldcr, C-21

rldic, C-21

rldicl, C-22 

rldicr, C-23 

rldimi, C-23 

rlmi, 10-149 

rlwimi, 10-150 

rlwinm, 10-151 

rlwnm, 10-152 

rrib, 10-153 
sc, 10-154, B-4

segment register instructions, B-6 

segment register manipulation, 3-89 

slbia, C-24 

slbie, C-25 

slbiex, C-26 

sld, C-26 

sle, 10-156 

sleq, 10-157 

sliq, 10-158 

slliq, 10-159

sllq, 10-160

slq, 10-161

slw, 10-162 

srad, C-27 

sradi, C-28 

sraiq, 10-164 

sraq, 10-163 

sraw, 10-165 

srawi, 10-166 

srd, C-29 

sre, 10-167 

srea, 10-168 

sreq, 10-169 

srliq, 10-171 

srlq, 10-172 

srq, 10-173 

srw, 10-174 

stb, 10-175 

stbu, 10-176 

stbux, 10-177 

stbx, 10-178 

std, C-29 

stdcx., C-30 

stdu, C-31

stdux, C-31

stdx, C-32 

stfd, 10-179 

stfdu, 10-180 

stfdux, 10-181 

stfdx, 10-182 

stfiwx, C-32 

stfs, 10-183 

stfsu, 10-184 

stfsux, 10-185 

stfsx, 10-186 

sth, 10-187 

sthbrx, 10-188 

sthu, 10-189 

sthux, 10-190 

sthx, 10-191 

stswi, 10-193 

stswx, 10-194 

stw, 10-195 

stwbrx, 10-196 



stwcx., 3-53, 10-197 

stwu, 10-198 

stwux, 10-199 

stwx, 10-200 

subf, 10-201 

subfc, 10-202 

subfe, 10-203 

subfic, 10-204 

subfme, 10-205 

subfze, 10-206 

supervisor-level cache management, 3-85 

support for lwarx/stwcx., 9-48

sync, 3-53, 10-207 

td, C-33 

tdi, C-34 

TLB management, 3-90 

tlbia, C-34

tlbie, 3-90, 10-208, B-7

tlbiex, C-35

tlbsync, C-36 

trap, 3-78 

tw, 10-210 

twi, 10-211 

unimplemented by MPC601, 32-bit, C-1 

unimplemented by MPC601, 64-bit, C-2 

word compare mnemonics, 3-15 

xor, 10-212 

xori, 10-213 

xoris, 10-214 
INT, 8-25, 9-47

Integer arithmetic instructions, 3-4
Integer compare instructions, 3-15 
Integer load instructions, 3-44 
Integer logical instructions, 3-16 
Integer rotate and shift instructions, 3-18, 3-19 
Integer store instructions, 3-47 
Integer unit 

execution timing, 7-19 

overview, 1-7 
Interrupt, external, 5-25 
isync, 3-53, 4-21, 10-78 
IU, 3-4

K 

Key (Ks, Ku) protection bits, 6-19 
Kill block operation, 4-15 



Latency, 7-1, 7-26, 9-22 
lbz, 10-79
lbzu, 10-80
lbzux, 10-81
lbzx, 10-82
ld, C-13
ldarx, C-13
ldu, C-14
ldux, C-15
ldx, C-15
lfd, 10-83
lfdu, 10-84
lfdux, 10-85
lfdx, 10-86
lfs, 10-87
lfsu, 10-88
lfsux, 10-89
lfsx, 10-90
lha, 10-91
lhau, 10-92
lhaux, 10-93
lhax, 10-94
lhbrx, 10-95
lhz, 10-96
lhzu, 10-97
lhzux, 10-98
lhzx, 10-99

Little-endian byte ordering, 2-45 

lmw, 10-100, B-4

Load address, simplified mnemonic, 3-93 

Load immediate, simplified mnemonic, 3-92 

Load operations 

I/O load accesses, 9-40 
memory coherency actions, 4-13 

Load/store 

address generation, 3-42 
byte reverse instructions, 3-48 
floating-point load instructions, 3-57 
floating-point move instructions, 3-62 
floating-point store instructions, 3-60 
integer load instructions, 3-44 
integer store instructions, 3-47 
load/store multiple instruction, 3-49 
memory synchronization instructions, 3-53 
move multiple instructions, 3-50 

Logical addresses 

translation into physical addresses, 6-1 

LR (link register), 2-18 

lscbx, 10-101

lswi, 10-103, B-4

lswx, 10-104, B-4

lwa, C-16

lwarx, 3-53, 10-105

lwarx/stwcx.

general information, 4-14, G-1
support, 9-48

lwaux, C-16

lwax, C-17

lwbrx, 10-106

lwz, 10-107
lwzu, 10-108
lwzux, 10-109
lwzx, 10-110

M 

Machine check exception, 2-36, 5-19 

maskg, 10-111 

maskir, 10-112 

mcrf, 10-113 

mcrfs, 10-114 

mcrxr, 10-115 

Memory accesses, 9-4 

Memory coherency bit (M bit) 

MMU (M-bit setting), 6-16 

cache interactions, 4-7 

coherency in multiprocessor systems, 4-12 

MMU (M-bit setting), 6-10, 6-57, 6-61 

timing considerations, 7-25 
Memory control instructions 

cache management, 3-85, 3-86 

segment register manipulation, 3-89 

TLB management, 3-90 
Memory management unit 

address translation flow, 6-8 

address translation mechanisms, 6-5, 6-21 

block address translation, 6-5, 6-8, 6-24 

block diagram, 6-3 

direct address translation, 6-7, 6-8, 6-16, 6-24, 6-34

exceptions, 6-12 

hashing functions, 6-45 

instruction TLB (ITLB), 6-15 

instructions and registers, 6-14 

memory protection, 6-7, 6-19, 6-31 

memory/cache access modes (WIM bits), 6-10 

overview, 1-8, 6-2

page address translation, 6-5, 6-8, 6-35, 6-41 

page history status, 6-8, 6-39 

page table search operation, 6-53 

page tables in memory, 6-41 

segment model, 6-31 

virtual address (52-bit), 6-35 
Memory synchronization 

eieio, 3-53 

isync, 3-53 

lwarx, 3-53

stwcx., 3-53 

sync, 3-53 
Memory unit 

bus interface logic, 4-25 

operation for loads and stores, 9-4 

overview, 1-9, 4-22
queuing priorities, 4-24
queuing structure, 4-14, 4-24 

Memory update modes 
copy-back mode, 7-25 

Memory/cache access modes, purpose, 7-25 

Memory/cache access modes, see WIM bits 

MESI protocol 

definition, MESI states, 1-21, 4-8 
enforcing memory coherency, 9-27 
hardware considerations, 4-10 

MESI state definitions, 4-8 

mfcr, 10-117 

mffs, 10-118 

mfmsr, 10-119, B-1

mfspr, 3-80, 10-120 

mfspr, POWER and PowerPC, B-5 

mfsr, 10-123, B-1 

mfsrin, 10-124 

mftb, C-17

Misaligned data transfer, 9-17 

Move register, simplified mnemonic, 3-93 

MQ register, 2-14, 3-4 

MSR (machine state register), 2-20 

mtcrf, 10-125 

mtfsb0, 10-126

mtfsb1, 10-127

mtfsf, 10-128 

mtmsr, 10-130 

mtspr, 3-80, 10-131 

mtspr, POWER and PowerPC, B-5 

mtsr, 10-133 

mtsrin, 10-134 

mul, 10-135 

mulhd, C-18 

mulhdu, C-19 

mulhw, 10-136 

mulhwu, 10-137 

mulld, C-19 

mulli, 10-139 

mullw, 10-138

Multiple-precision shifts, 3-20, E-1 

N 

nabs, 10-140 

nand, 10-141 

neg, 10-142 

No-op, 3-92 

nor, 10-143 

Normal termination, 9-23 



Operand placement and performance, 2-42 
Operating environment architecture, 1-11 



or, 10-144 

orc, 10-145

ori, 10-146

oris, 10-147 

Out-of-order instruction issue, 7-13 



Page address translation 

generation of physical addresses, 6-35 

page address translation flow, 6-41 

page memory protection, 6-19, 6-40

page size, 6-31 

page tables in memory, 6-41 

segment registers, 6-33, 6-36 

selection of page address translation, 6-5, 6-34 

table search operation, 6-53 

UTLB organization, 6-33

virtual address and virtual segment ID, 6-35 
Page history status 

R and C bit recording, 6-8, 6-39 

R and C bit updates, 6-56 
Page tables 

allocation of PTEs, 6-49 

example table structures, 6-49, 6-51

organized as PTEGs, 6-43

page table size, 6-44

page table updates, 6-56 

PTE format, 6-37 

PTEG addresses, 6-47, 6-51 

table search for PTE, 6-53 
PCLK_EN,8-31 

Performance considerations, memory, 7-25 
Physical address generation 

block physical address generation, 6-29 

generation of PTEG addresses, 6-47, 6-51 

memory management unit, 6-1 

page physical address generation, 6-35 
PIR (processor identification register, HID15), 2-41
POWER architecture 

deleted instructions in PowerPC, B-8 

migration to PowerPC, B-1

POWER instructions in PowerPC, B-9

POWER/PowerPC, incompatibilities, B-1

svcx instruction, B-4 
PowerPC architecture 

features used in MPC601, 1-13 

instructions, A-1 

levels of implementation, 1-11 

operating environment architecture, 1-11 

POWER/PowerPC, incompatibilities, B-1 

registers/implementation, 1-14

user instruction set architecture, 1-11 

virtual environment architecture, 1-11 
PP protection bits, 6-19
Precise exceptions, 5-5, 5-9 
Prefetch timing, 7-6 
Priorities 

cache access priorities, 4-4

exception priorities, 5-7, 5-10 

memory unit queuing priorities, 4-24 
Privilege levels 

changing privilege levels, 2-20, 5-14

supervisor-level cache instruction, 3-85 

supervisor-level registers, 2-20 

user-level cache instructions, 3-86 

user-level registers, 2-6 
Process switching, 5-14 
Processor control instructions, 3-80 
Program exception, 5-32 
Programming model 

supervisor-level registers, 2-20 

user-level registers, 2-6 
Protection of memory areas 

block access protection, 6-19, 6-29, 6-31

I/O controller interface protection, 6-8, 6-61

options available, 6-7, 6-19 

page access protection, 6-19, 6-31, 6-40 

programming protection bits, 6-19 

protection violations, 6-12, 6-20, 6-31 
PTEGs (PTE groups) 

definition, 6-43 

example primary and secondary PTEGs, 6-51

generation of PTEG addresses, 6-47 

table search operation, 6-53 
PTEs (page table entries), 6-35, 6-37, 6-43, 6-53, 6-56 
PVR (processor version register), 2-33 

Q 

Qualified bus grant, 9-8 
Qualified data bus grant, 9-21 
Qualified snoop request, 4-14 
Queuing structure, memory unit, 4-24 
QUIESC_REQ, 8-28 



Read (with clean) operations, 4-13 

Read atomic operation, 4-15 

Read operation, 4-15 

Read with intent to modify operation, 4-15 

Real-time clock (RTC), B-7 

Reference (R) bit maintenance 

recording, 6-8, 6-39, 6-40, 6-54 

updates, 6-56 
Registers 

DEC, B-7 

PowerPC implementation, 1-14 



reserved bits, B-2 
supervisor-level 
MSR, 2-20 
SR, 2-22 
supervisor-level SPRs 
BATs, 2-33 
DAR, 2-27 
DEC, 2-28 
DSISR, 2-27 
EAR, 2-31 
HID Registers, 2-35 
PVR, 2-33 
SDR1, 2-29
SPRG0-SPRG3, 2-31
SRR0, 2-30
SRR1, 2-30
user-level 
CR, 2-11 

FPR0-FPR31, 2-6
FPSCR, 2-7 
GPR0-GPR31, 2-6 
user-level SPRs 
CTR, 2-19 
LR, 2-18 
MQ, 2-14 
RTC, 2-16 
XER, 2-15 
Reserved instruction class, D-3 
Reset 

hard reset, 2-71, 5-17
register state after reset, 2-71
reset exception, 5-16
soft reset, 2-72, 5-17
Reset signals 

HRESET, 8-26, 9-48
QUIESC_REQ, 8-28
RESUME, 8-27
RSRV, 8-28
SC_DRIVE, 8-28
SRESET, 8-27, 9-48
SYS_QUIESC, 8-27
RESUME, 8-27
rfi, 10-148
rldcl, C-20
rldcr, C-21
rldic, C-21
rldicl, C-22
rldicr, C-23
rldimi, C-23
rlmi, 10-149
rlwimi, 10-150
rlwinm, 10-151 
rlwnm, 10-152 
Rotate and shift operations, 3-18 
Rounding, floating-point operations, 2-68
rrib, 10-153
RSRV, 8-28, 9-49
RTC (real time clock) 

RTC facility, 2-16 

signal, 8-35 
Run mode exception, 5-48 



sc, 5-47, 10-154 

SC_DRIVE, 8-28 

SDR1 (table search description) register

format, 2-29, 6-44 

generation of PTEG addresses, 6-47, 6-51 
Segment registers 

format, 2-22, 6-36, 6-60 

instructions, 6-37, B-6 

SR manipulation instructions, 3-89 

T-bit, 2-22, 6-34, 9-35 

updates, 2-24 
Segmented memory model, see Memory management unit
SHD, 8-18
Signals 

2X_PCLK, 8-31

A0-A31, 8-7

AACK, 8-16

ABB, 8-5, 9-8

address arbitration, 9-8

address transfer, 9-12

address transfer attribute, 9-13

AP0-AP3, 8-8

APE, 8-9

ARTRY, 8-17, 9-23

BCLK_EN, 8-32

BG, 8-4, 9-8

BR, 8-4, 9-8

checkstop, 9-47

CI, 8-14

CKSTP_IN, 8-25

CKSTP_OUT, 8-26

configuration, 8-2

CSE0-CSE2, 8-15, 9-27

data arbitration, 9-8, 9-21

data transfer termination, 9-23

DBB, 8-20, 9-8, 9-22

DBG, 8-19, 9-8

DBWO, 8-19, 9-8, 9-50

DH0-DH31/DL0-DL31, 8-21

DP0-DP7, 8-22

DPE, 8-23

DRTRY, 8-24, 9-23, 9-25

ESP interface, 8-29

GBL, 8-15

HP_SNP_REQ, 8-16

HRESET, 8-26

INT, 8-25, 9-47

PCLK_EN, 8-31

QUIESC_REQ, 8-28

reset, 9-48

RESUME, 8-27

RSRV, 8-28, 9-49

RTC (real time clock), 8-35

SC_DRIVE, 8-28

SHD, 8-18

soft stop control, 9-48

SRESET, 8-27, 9-48

SYS_QUIESC, 8-27

TA, 8-23, 9-23

TBST, 8-13, 9-22

TC0-TC1, 8-14, 9-17

TEA, 8-24, 9-23, 9-25

TS, 8-6

TSIZ0-TSIZ2, 8-12, 9-14

TT0-TT4, 8-10, 9-13

WT, 8-15

XATS, 8-6, 9-37
Simplified mnemonics, 3-91 
Single-beat reads with data delays, timing, 9-32 
Single-beat transfer 

back-to-back, timing, 9-33 

reads with data delays, timing, 9-31

reads, timing, 9-29 

termination, 9-23 

writes, timing, 9-30 
slbia, C-24 
slbie, C-25 
slbiex, C-26 
sld, C-26 
sle, 10-156 
sleq, 10-157 
sliq, 10-158
slliq, 10-159
sllq, 10-160
slq, 10-161
slw, 10-162 

Snoop operation, 4-14, 7-25, 9-19 
Snoop status signals, 4-14 
Soft stop control signals, 9-48 
Split-bus transaction, 9-9 

SPR encodings, unimplemented in MPC601, C-1, C-3 
SPRG0-SPRG3 (general SPRs), 2-31 
SR (segment register), 2-22 
srad, C-27 
sradi, C-28 
sraiq, 10-164 
sraq, 10-163 
sraw, 10-165 
srawi, 10-166 
srd, C-29

sre, 10-167 

srea, 10-168 

sreq, 10-169

SRESET, 8-27 

srliq, 10-171 

srlq, 10-172 

srq, 10-173 

SRR0/SRR1 (status save/restore registers), 2-30

srw, 10-174 

Static branch prediction, 7-15 

stb, 10-175 

stbu, 10-176 

stbux, 10-177 

stbx, 10-178 

std, C-29 

stdcx., C-30 

stdu, C-31 

stdux, C-31 

stdx, C-32 

stfd, 10-179 

stfdu, 10-180 

stfdux, 10-181 

stfdx, 10-182 

stfiwx, C-32 

stfs, 10-183 

stfsu, 10-184 

stfsux, 10-185 

stfsx, 10-186 

sth, 10-187 

sthbrx, 10-188 

sthu, 10-189 

sthux, 10-190

sthx, 10-191 

stmw, 10-192 

Store operations 

I/O operations to BUC, 9-39 
memory coherency actions, 4-14 
single-beat writes, 9-30 

stswi, 10-193 

stswx, 10-194 

stw, 10-195 

stwbrx, 10-196 

stwcx., 3-53, 10-197 

stwu, 10-198 

stwux, 10-199 

stwx, 10-200 

subf, 10-201 

subfc, 10-202 

subfe, 10-203 

subfic, 10-204 

subfme, 10-205 

subfze, 10-206 

Supervisor mode, see privilege levels 



sync, 3-53, 10-207 
sync operation, 4-15 
Synchronization

context, 2-24 

memory synchronization instructions, 3-53 

sync, B-5 
Synchronous/precise exceptions, 5-6
SYS_QUIESC, 8-27 
System call exception, 5-47 
System linkage instructions, 3-76 
System status 

CKSTP_IN, 8-25

CKSTP_OUT, 8-26

HRESET, 8-26

INT, 8-25

QUIESC_REQ, 8-28

RESUME, 8-27

RSRV, 8-28

SC_DRIVE, 8-28

SRESET, 8-27

SYS_QUIESC, 8-27

T 

TA, 8-23, 9-23

Table search operations 

algorithm, 6-53 

hashing functions, 6-45 

page table definition, 6-43 

SDR1 register, 6-44

table search flow (primary and secondary), 6-54
TBST, 8-13, 9-22
TC0-TC1 signals, 8-14, 9-17
td, C-33 
tdi, C-34 
TEA, 8-24, 9-25 
Termination, 9-18, 9-23 
Test signals, 8-30 
Timer facilities, B-7 
Timing diagrams, interface 

address transfer signals, 9-12 

back-to-back single-beat transfers, 9-33 

burst transfers with data delays, 9-34 

I/O controller interface load access, 9-46 

I/O controller interface store access, 9-47 

single-beat reads, 9-29 

single-beat reads with data delays, 9-31 

single-beat writes, 9-30 

single-beat writes with data delays, 9-32 

use of TEA, 9-35 

using DBWO, 9-50 
Timing, instruction 

BPU execution timing, 7-14 

cache arbitration, 7-6 

cache hit, 7-7 
cache miss, 7-8 

decode timing, 7-10 

FPU execution timing, 7-22 

instruction execute timing, 7-12 

IU execution timing, 7-19 

prefetch timing, 7-6 

writeback timing, 7-14 
TLB invalidate 

TLB invalidate broadcast operations, 6-14, 6-15, 
6-57 

TLB management instructions, 3-90 

tlbie instruction, 6-14, 6-15, 6-57 
tlbia, C-34 
tlbie, 3-90, 10-208 
tlbie, POWER and PowerPC, B-7 
tlbiex, C-35 
tlbsync, C-36 

tlbsync instruction emulation, 6-57 
TO operand, 3-78 
Transactions, cache, 4-5 
Transfer, 9-12, 9-22 
Trap instructions, 3-78 
TS, 8-6, 9-12 

TSIZ0-TSIZ2 signals, 8-12, 9-14 
TT0-TT4, 8-10, 9-13 
tw, 10-210 
twi, 10-211 



Write-through mode (W bit) 

cache interactions, 4-7 

MMU (W-bit setting), 6-16, 6-57, 6-61 

timing considerations, 7-26 
Write-with-flush operations, 4-13 
WT, 8-15 



XATS signal, 8-6, 9-37 

XER (integer exception register), 2-15 

xor, 10-212 

xori, 10-213 

xoris, 10-214 



U 

Unimplemented instructions in MPC601, C-1 

Use of TEA, timing, 9-35 

User instruction set architecture, 1-11 

User mode, see privilege levels 

Using DBWO, timing, 9-50 

UTLB, 6-33, 6-39 

V 

Vector offset table, exception, 5-2, 5-16 
Virtual address (52-bit) 

logical to virtual to physical address translation, 
6-35 
Virtual environment architecture, 1-11 
Virtual memory implementation, 6-2 



W 

WIM bits, 4-7, 6-10, 6-16, 6-57, 6-61, 9-27 
Word compare mnemonics, 3-15 
Write with atomic operation, 4-15 
Write with flush operation, 4-15 
Write with kill operation, 4-15 
Write-back mode, 6-17 
Writeback timing, 7-14 
Write-through (W bit), 6-10 





